0 Follower
0 Following
bwitherspoon
Brett Witherspoon

Location

US

Badges

2
0
0

Challenges Entered

Multi-Agent Dynamics & Mixed-Motive Cooperation

Latest submissions

No submissions made in this challenge.

Machine Learning for detection of early onset of Alzheimers

Latest submissions

No submissions made in this challenge.

Multi-Agent Reinforcement Learning on Trains

Latest submissions

No submissions made in this challenge.

Multi-agent RL in game environment. Train your Derklings, creatures with a neural network brain, to fight for you!

Latest submissions

graded 125804
failed 125707
graded 124059

Multi Agent Reinforcement Learning on Trains.

Latest submissions

No submissions made in this challenge.

5 Problems 15 Days. Can you solve it all?

Latest submissions

No submissions made in this challenge.

Multi-Agent Reinforcement Learning on Trains

Latest submissions

graded 117867
graded 117431
graded 117429
Participant Rating
bwitherspoon has not joined any teams yet...

Dr. Derks Mutant Battlegrounds

Derk3: Open submission and potential baseline

Over 3 years ago

I have made significant improvements to this baseline and pushed them to the “develop” branch of the repository. I will merge it into master once I finish tuning and reviewing my work.

Its current state scores a very respectable 2.432: https://www.aicrowd.com/challenges/dr-derks-mutant-battlegrounds/submissions/123185

If you can't wait for me to finish, you can go ahead and check out the develop branch, but expect it to change quickly. I will update the original post with some details of the changes.

Derk3: Open submission and potential baseline

Over 3 years ago

This project needs to be updated for recent changes in the gym and the competition. I will update it.

Clarify evaluation points

Over 3 years ago

The main page says you get 4 points per opponent Derkling you kill and 13 points per statue, but the evaluation appears to use the gym's default reward function, which is 4 points per statue and 1 point per Derkling.

The replay shows a misleading display of points that appears to be hardcoded in the game. The true reward function is shown in the small boxes, and I believe those values are what the scores reflect.

Also the equation for the scoring appears to be a placeholder. I think you should also mention that the score is averaged over 128 games.

Finally, it would be less confusing if the reward function used in the starter kit run.py were the actual reward function used for evaluation.
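To make the discrepancy concrete, here is a small sketch that totals one game under both point schemes quoted above. The dictionary keys and the helper name are hypothetical and chosen only for illustration; they are not taken from the gym's API.

```python
# Point values as quoted in this post; the key names are hypothetical.
MAIN_PAGE_POINTS = {"derkling_kill": 4, "statue_kill": 13}
GYM_DEFAULT_POINTS = {"derkling_kill": 1, "statue_kill": 4}


def game_points(points, derkling_kills, statue_kills):
    """Total points for one game under a given point scheme."""
    return (points["derkling_kill"] * derkling_kills
            + points["statue_kill"] * statue_kills)


# Example game: two opposing Derklings and one statue destroyed.
main_page = game_points(MAIN_PAGE_POINTS, derkling_kills=2, statue_kills=1)
gym_default = game_points(GYM_DEFAULT_POINTS, derkling_kills=2, statue_kills=1)
print(main_page, gym_default)  # 21 6
```

The same game is worth 21 points under the main-page description and 6 under the default described here, so the two cannot be reconciled by a simple rescaling of the leaderboard.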

Challenge announcement | GPU submissions, build & run logs, and more

Over 3 years ago

Some of the score variation may be on our end as mentioned. I might test by fixing a random set of items.

Challenge announcement | GPU submissions, build & run logs, and more

Over 3 years ago

Yes, if you submit the same submission twice you get different results (usually they are similar). I would expect this given the random item generation. However, I would also expect averaging over many games to negate this effect.

Maybe 128 games is not enough to average over. Maybe we should try more?

Another possible solution would be to generate a secret random set of items and keep it fixed during evaluation. There may be some other sources of randomness though.

Having a symmetric evaluation is a good idea too, especially if the secret set of items is kept fixed during evaluation. However, I would also expect many random trials to have a similar effect.
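As a rough sketch of why the number of games matters: if per-game scores are assumed to be independent with a common standard deviation (an assumption, not something measured on the evaluator), the spread of the averaged score shrinks with the square root of the number of games. The value of SIGMA below is a hypothetical placeholder.

```python
import math
import random
import statistics


def standard_error(sigma, n_games):
    """Standard error of the mean of n_games independent scores."""
    return sigma / math.sqrt(n_games)


def simulated_spread(sigma, n_games, n_submissions=200, seed=0):
    """Empirical std dev of the averaged score across repeated evaluations."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n_games))
        for _ in range(n_submissions)
    ]
    return statistics.stdev(means)


SIGMA = 1.0  # hypothetical per-game standard deviation, not a measured value
for n in (128, 512, 2048):
    print(n, round(standard_error(SIGMA, n), 4), round(simulated_spread(SIGMA, n), 4))
```

Under these assumptions, going from 128 to 512 games only halves the run-to-run variation, so a fixed item set or a symmetric evaluation may be a cheaper way to reduce it.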

Challenge announcement | GPU submissions, build & run logs, and more

Over 3 years ago

The change in random item selection seems to have had some unintended consequences. Submissions trained and submitted on previous versions of the environment can no longer repeat their leaderboard score. For new submissions, the distribution of scores has also changed.

For example, my submission developed during the warm-up round (and resubmitted for the first round) consistently scored ~2.6 (https://www.aicrowd.com/challenges/dr-derks-mutant-battlegrounds/submissions/116018). However, when resubmitted now, it scores ~1.6 (https://www.aicrowd.com/challenges/dr-derks-mutant-battlegrounds/submissions/121655).

Even submissions trained and evaluated on the new environment typically receive lower scores than before, so the distribution of scores has changed.

I am not complaining about that change. I think it was a good one. I am just saying the competition environment changed in the middle of the competition. The overview page for the competition says that round 1 ends Feb 15. Given the change in environment, if round 2 is not ready, wouldn’t it be a good idea to start a new round with a new leaderboard?

How to debug a failed submission

Over 3 years ago

This issue was solved in the Discord channel. For those having a similar issue, the conclusion was that logs are not provided when the image fails to build. The best way to debug a failed build is to try building the image yourself locally; see "Which docker image is used for my submissions?"

In this case there was just a missing dependency in apt.txt.

Random items generation question

Almost 4 years ago

The most recent version of gym-derk was updated on 1/21, I assume, in response to this issue. It tweaked how the items are assigned randomly. But the details are now in the documentation.

Why wasn’t there any communication with us? This appears to be a pattern.

Random items generation question

Almost 4 years ago

I am not sure whether you can determine visually which items were given.

According to the documentation there are 7 items for the arms slot, 5 items for the misc slot, and 3 items for the tail slot. Some of these slots can be empty (I am not sure whether the random selection leaves slots empty), but a Derkling definitely shouldn’t have more than one item per slot.

What I am concerned about is the training time and network capacity required to learn all these capabilities. That is going to put this competition out of reach for the average person without a big GPU and/or computing time, which I really think is not in the spirit of this competition.

Random items generation question

Almost 4 years ago

That’s a good approach too and could be really good if the combinations were designed well.

Does the current randomization allow duplicate items?

If the items are chosen without replacement, currently there are 15 choose 3 = 455 combinations per agent.
But then the problem depends on what is given to your whole team and then what is given to the opposing team…

If items can be freely chosen, I would hope there is no really dominant combination.
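The count above can be checked with a few lines. The sketch also shows the alternative count if exactly one item is drawn per slot, using the slot sizes quoted from the documentation in another post in this thread (7 arms, 5 misc, 3 tail). Whether the game actually draws items either way is an assumption, and empty slots are ignored.

```python
import math

# Slot sizes as quoted from the documentation elsewhere in this thread.
SLOT_SIZES = {"arms": 7, "misc": 5, "tail": 3}

total_items = sum(SLOT_SIZES.values())         # 15 items overall
any_three = math.comb(total_items, 3)          # 15 choose 3, slots ignored
one_per_slot = math.prod(SLOT_SIZES.values())  # 7 * 5 * 3, one item per slot

print(total_items, any_three, one_per_slot)  # 15 455 105
```

Either way, these are per-agent counts; the number of distinct team-versus-team matchups is far larger.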

Random items generation question

Almost 4 years ago

I don’t know about not receiving all 3 items, but I have noticed that the random generation will sometimes result in a really unfair game.

I asked on Discord if the evaluation score was averaged over a large number of arenas, which might mitigate the issue, but never received a response.

I agree that the competition would be better and the problem much more interesting if you could choose the items (ideally learning to choose the items). The environment supports that (not on reset though).

Derk3: Open submission and potential baseline

Almost 4 years ago

The most recent submission (after some additional training) is actually here.

Derk3: Open submission and potential baseline

Almost 4 years ago

I have made an open submission that could serve as a baseline for anyone wanting to get bootstrapped into the competition.

The repository includes pre-trained weights, which earned a decent score in the last submission.

The baseline implementation is intentionally minimal, but the base algorithm (PPO) is fairly advanced and very popular. There are many opportunities for someone to extend the algorithm or architecture. It could also use some hyperparameter tuning, reward function shaping, and a well designed training procedure. Additional information and some possible directions for improvement can be found in the project README.md.

I will provide additional information on the details if there is interest. As other participants post higher-scoring submissions, this baseline implementation will also be enhanced. Please consider sharing your extensions, or at least a comparison to this baseline, with the community.

bwitherspoon has not provided any information yet.