Challenges Entered
Multi-Agent Reinforcement Learning on Trains
Latest submissions
graded | 97850 |
graded | 97836 |
graded | 97794 |
Robots that learn to interact with the environment autonomously
Participant | Rating |
---|---|
student | 271 |
MasterScrat | 229 |
vrv | 0 |
chen_hao_1 | 0 |
Flatland
Round 1 has finished, Round 2 is starting soon!
About 4 years ago
Yes, I feel the current number of agents is large enough…
It seems that generating large environments is very slow, which may be a problem for offline RL training on large environments…
Round 1 has finished, Round 2 is starting soon!
About 4 years ago
If some teams can solve all environments within 8 hours, is there a deadline after which the environment will no longer change in Round 2?
I think it would be helpful to keep the environment unchanged for the last 3+ weeks, so that we have time to fine-tune our algorithms instead of searching for different directions…
Round 1 has finished, Round 2 is starting soon!
About 4 years ago
Thanks @MasterScrat, looking forward to Round 2.
Can you please share the environment specifications for Round 2, so that we can start thinking about possible directions?
Round 1 has finished, Round 2 is starting soon!
About 4 years ago
Hi @MasterScrat, is there any plan for when Round 2 will start?
Train Close Following
About 4 years ago
Hey, is the current Round 1 using the master branch's version or the pip release 2.2.1?
Submit both RL and OR method
About 4 years ago
What happens if we make an OR submission and then an RL submission?
Which result will be shown on the leaderboard, or will both be shown?
Addressing Round 1 pain points
About 4 years ago
When is the deadline of Round 1? Within 1 day? @MasterScrat
Team merging deadline
About 4 years ago
Hi @MasterScrat, it seems that we cannot invite members to our team now?
Addressing Round 1 pain points
About 4 years ago
Hi @MasterScrat, thanks for the kind reply and explanation.
As no other teams using RL share similar concerns, please move forward.
Addressing Round 1 pain points
About 4 years ago
I may be wrong, but here is my feedback about adding many more evaluation episodes:
- Currently RL's completion rate is low even under the current env settings. Adding many more episodes may narrow the application of RL when it has to compete with OR methods.
- It may push us to focus more on OR methods.
As I commented before, I think larger envs are good, but it's better to have far fewer test cases.
Pain points in Round 1 and wishes for Round 2?
About 4 years ago
Thanks for opening this thread for discussion.
As a participant who is really interested in using RL to solve this problem, my concerns are:
- Timing. When we use RL, we likely need a GPU for inference. Unfortunately, GPU utilization will be low because it only serves one or a few states per batch. So I expect that for larger grid sizes, RL with a GPU is likely to be less efficient than OR methods.
- Diversity of envs. Having 14 different grid sizes makes RL training harder. If we further consider different speeds, deadlock-free planning may require more effort.
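The timing concern above can be illustrated with a minimal sketch. It assumes a toy linear policy written in NumPy (not any participant's actual model) and hypothetical helper names; it only shows the difference between one forward pass per agent and a single pass over all agents' observations stacked into one batch.

```python
import numpy as np


def policy_forward(weights, observations):
    """Toy linear policy: greedy action for every row of a (batch, obs_dim) array."""
    logits = observations @ weights  # shape: (batch, n_actions)
    return logits.argmax(axis=1)


def act_one_by_one(weights, per_agent_obs):
    """One forward pass per agent, i.e. batch size 1 each time."""
    return {
        handle: int(policy_forward(weights, obs[None, :])[0])
        for handle, obs in per_agent_obs.items()
    }


def act_batched(weights, per_agent_obs):
    """Stack all agents' observations and run a single forward pass."""
    handles = list(per_agent_obs)
    batch = np.stack([per_agent_obs[h] for h in handles])
    actions = policy_forward(weights, batch)
    return {h: int(a) for h, a in zip(handles, actions)}
```

Both functions return the same actions; the batched variant just issues one call per time step. Even so, the batch is bounded by the number of agents in a single step, because the environment has to advance before the next observations exist, which is consistent with the low-utilization worry raised above.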
My wishes for Round 2 are:
- Use only a few large test cases (for example, # of test cases <= 10) while keeping the same overall running time. It may be even better to test with a single grid size.
- Use the same speed for all agents. I personally prefer to focus on RL-related things instead of dealing with deadlocks caused by different speeds.
I think one of OR's shortcomings is that it's not straightforward to optimize for a global reward.
My understanding: RL's advantage is finding a better solution (possibly combined with OR), not acting in a shorter time.
If we want to see RL perform better than OR, we should give RL enough time for planning/inference on large grid envs (both 5 min and 5 s may not be enough for RL to do planning and inference).
Number of test cases and video of each submission
Over 4 years ago
How many test maps are used to generate the submission result?
After each submission, a video is produced for it. Does the video include all test cases?
Config of simulation environment during training and evaluation
Over 4 years ago
As you mentioned, operations research may work better on small maps.
Will there be test cases with small map sizes?
If yes, we may need to implement an operations research algorithm alongside the RL algorithm.
My question is: will you set a minimum map size? For example, larger than K x K, so that most operations research algorithms cannot solve the problem within the time limit. Then we could focus more on really large maps.
Config of simulation environment during training and evaluation
Over 4 years ago
Thanks @MasterScrat for the quick reply.
Things are much clearer to me after your reply.
Config of simulation environment during training and evaluation
Over 4 years ago
Thanks @MasterScrat for the kind reply.
May I know how much Round 1 and Round 2 may differ?
Consider two different settings:
- our algorithm only needs to work with map size 150 * 150
- our algorithm also needs to work with map size 1500 * 1500
The optimal state representation and algorithm design can be quite different when the problem settings differ.
Conda env creation errors...UPDATED: later EOF error when running evaluator
Over 4 years ago
I am using WSL2 with Ubuntu (16.04) and Docker.
It works well so far.
For the visualization, I have tried two ways, and both work for me:
- Install a GUI and X server for WSL2.
  Some links I found helpful:
  - https://medium.com/@dhanar.santika/installing-wsl-with-gui-using-vcxsrv-6f307e96fac0
  - https://askubuntu.com/questions/1162808/run-ubuntu-desktop-on-wsl-ubuntu-18-04-lts
- After getting frames in PNG format, use the following function to generate a video:
  https://gitlab.aicrowd.com/flatland/flatland/blob/master/flatland/evaluators/aicrowd_helpers.py#L108
Overall, I feel the second method is simpler, and I am currently using it for visualization.
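For readers who want to try the second method without the linked helper, here is a rough sketch of turning a folder of PNG frames into a video. It is not the code behind the link; it assumes `ffmpeg` is installed and on the PATH, and the frame naming pattern, frame rate, and function names are hypothetical choices for illustration.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(frames_dir, output_path, fps=7,
                         pattern="flatland_frame_%04d.png"):
    """Build an ffmpeg command that encodes numbered PNG frames as H.264 video.

    The file name pattern and fps are assumptions; adjust them to match
    however the frames were actually saved.
    """
    return [
        "ffmpeg", "-y",                      # overwrite the output if it exists
        "-framerate", str(fps),              # input frame rate
        "-i", str(Path(frames_dir) / pattern),
        "-c:v", "libx264",                   # H.264 encoder
        "-pix_fmt", "yuv420p",               # widely supported pixel format
        str(output_path),
    ]


def frames_to_video(frames_dir, output_path, fps=7,
                    pattern="flatland_frame_%04d.png"):
    """Run ffmpeg; raises CalledProcessError if encoding fails."""
    cmd = build_ffmpeg_command(frames_dir, output_path, fps, pattern)
    subprocess.run(cmd, check=True)
```

A call such as `frames_to_video("frames", "episode.mp4")` would then write the video next to the working directory, with no GUI or X server needed.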
Config of simulation environment during training and evaluation
Over 4 years ago
For RL to work well, it's better to have similar configs for the training and evaluation simulation environments.
To help set up the training environment properly, can you provide some basic information about the evaluation environment?
For example, the range of the following settings:
- width and height of map
- num of trains
- num of cities
- type of city distribution
- speed ratio of trains
- max rails between cities
- max rails in cities
- type of schedule generator
- malfunction: rate, min/max duration.
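One way to use such information, once known, is to sample training configurations from the published ranges. The sketch below is only an illustration: every range and value in it is a placeholder (the evaluation settings are exactly what this post is asking for), and the class and function names are hypothetical rather than part of the Flatland API.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingEnvConfig:
    width: int
    height: int
    num_trains: int
    num_cities: int
    grid_city_distribution: bool     # True: cities on a grid, False: random
    speed_ratios: dict               # speed -> fraction of trains
    max_rails_between_cities: int
    max_rails_in_city: int
    schedule_generator: str          # placeholder label only
    malfunction_rate: float
    malfunction_min_duration: int
    malfunction_max_duration: int


# Placeholder ranges: NOT the real evaluation settings.
RANGES = {
    "size": (25, 150),
    "num_trains": (5, 50),
    "num_cities": (2, 20),
    "max_rails_between_cities": (1, 4),
    "max_rails_in_city": (2, 4),
    "malfunction_duration": (20, 50),
}


def sample_config(rng):
    """Draw one training configuration uniformly from the placeholder ranges."""
    size = rng.randint(*RANGES["size"])
    lo, hi = RANGES["malfunction_duration"]
    min_dur = rng.randint(lo, hi)
    return TrainingEnvConfig(
        width=size,
        height=size,
        num_trains=rng.randint(*RANGES["num_trains"]),
        num_cities=rng.randint(*RANGES["num_cities"]),
        grid_city_distribution=rng.random() < 0.5,
        speed_ratios={1.0: 0.25, 0.5: 0.25, 1 / 3: 0.25, 0.25: 0.25},
        max_rails_between_cities=rng.randint(*RANGES["max_rails_between_cities"]),
        max_rails_in_city=rng.randint(*RANGES["max_rails_in_city"]),
        schedule_generator="sparse",
        malfunction_rate=rng.uniform(0.0, 0.01),
        malfunction_min_duration=min_dur,
        malfunction_max_duration=rng.randint(min_dur, hi),
    )
```

Calling `sample_config(random.Random(0))` at the start of each training episode would give a reproducible stream of configurations; the resulting fields would still need to be passed to whatever environment constructor is in use.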
Conda env creation errors...UPDATED: later EOF error when running evaluator
Over 4 years ago
Same here.
On Windows, I can only install the environment via: pip install flatland-rl
However, the evaluator fails to run, with the same error as in MemoAI's post (… EOFError: Ran out of input).
Round 1 has finished, Round 2 is starting soon!
About 4 years ago
How can we use the latest Flatland environment: the master branch's version, or a pip release 2.x.x?
(Round 1 was using flatland-rl==2.2.1)
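Whichever version ends up being required, it can help to confirm which pip release is actually installed before training. The following is a small sketch using only the Python standard library; the function names are hypothetical, and the pinned version is the Round 1 value quoted in this post, not a statement about Round 2.

```python
from importlib import metadata

ROUND_1_VERSION = "2.2.1"  # as stated in the post above


def installed_version(package="flatland-rl"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_round_1(version):
    """True only when the installed release equals the Round 1 pin."""
    return version == ROUND_1_VERSION
```

Printing `installed_version()` at the top of a training script makes it easy to spot a mismatch between the local environment and the evaluator.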