Challenges Entered
Behavioral Representation Learning from Animal Poses.
Latest submissions
graded | 182254
graded | 182252
graded | 179065
What data should you label to get the most value for your money?
ASCII-rendered single-player dungeon crawl game
Latest submissions
graded | 158823
failed | 158209
failed | 158208
Airborne Object Tracking Challenge
Machine Learning for detection of early onset of Alzheimers
Measure sample efficiency and generalization in reinforcement learning using procedurally generated environments
Self-driving RL on DeepRacer cars - From simulation to real world
Latest submissions
graded | 165209
failed | 165208
failed | 165206
Robustness and teamwork in a massively multiagent environment
5 Puzzles 21 Days. Can you solve it all?
Multi-Agent Reinforcement Learning on Trains
Latest submissions
Latest submissions
graded | 143804
graded | 125756
graded | 125751
Learn to Recognise New Behaviors from limited training examples.
Latest submissions
graded | 125756
graded | 125589
Reinforcement Learning, IIT-M, assignment 1
Latest submissions
graded | 125767
submitted | 125747
graded | 125006
IIT-M, Reinforcement Learning, DP, Taxi Problem
Latest submissions
graded | 125767
graded | 125006
graded | 124921
Latest submissions
graded | 128400
submitted | 128365
Latest submissions
failed | 131869
graded | 130090
graded | 128401
Latest submissions
graded | 135842
graded | 130545
Round 1 - Completed
Round 1 - Completed
Identify Words from silent video inputs.
Round 2 - Active | Claim AWS Credits by beating the baseline
Latest submissions
graded | 182252
graded | 178951
graded | 178941
Round 2 - Active | Claim AWS Credits by beating the baseline
Latest submissions
graded | 182254
graded | 179065
- Random-walk Airborne Object Tracking Challenge
Multi Agent Behavior Challenge 2022
Round 2 Deadlines updated
25 days ago
Hi Everyone,
Happy to announce that we've extended Round 2 by 1 and a half months after receiving your feedback that the current timeline was too short to work with video data. Looking forward to your solutions!
Round 2 now ends on 3rd July 2022.
Data Purchasing Challenge 2022
Share your solutions!
About 1 month ago
Hi Everyone,
Thank you for participating in the Data Purchasing Challenge; it has been a unique journey for this one-of-a-kind challenge. We're excited to know your ideas and solutions. Regardless of whether you won, any ideas you share on the discussion forum are highly appreciated.
Sharing your solutions also helps you reflect upon your learnings throughout the competition.
I'll summarize all the shared solutions in this post, as I'm sure they'll be tremendously useful for participants of future Data Purchasing Challenges (yes, there will be more).
Please also share failed ideas; as all machine learning aficionados know, negative samples are just as important.
Looking forward to your solutions!
[Announcement] Leaderboard Winners
About 1 month ago
Hi @xiaozhou_wang, glad you'll be sharing your solutions. Congratulations on bagging the top spot. As @Camaro suggested, please make a discourse post, and you can also make your submissions repository on AIcrowd Gitlab public if you want to share your full submissions journey.
:rotating_light: Select submissions for final evaluation
About 1 month ago
Hi @chuifeng
No, all the runs are completed. The leaderboard is final. The best scoring submission of each participant from the full runs was used. The "successful entries" column is a generic feature of the leaderboard; since the leaderboard was calculated offline and the final results were added separately, it shows 1.
Please check this sheet with the full scores of all the runs.
:rotating_light: Select submissions for final evaluation
About 2 months ago
Hi @leocd
The highest scoring among the selected submissions will be used.
:rotating_light: Select submissions for final evaluation
About 2 months ago
Hi everyone,
Thanks for participating in the Data Purchasing Challenge!
We hope you had fun with the problem and the dataset. In this next phase, we will be evaluating your selected submissions on the full hidden dataset.
Know more about it here:
- https://discourse.aicrowd.com/t/important-details-about-end-of-competition-evaluations/7512
- https://discourse.aicrowd.com/t/code-for-end-of-competition-training-pipelines/7547
Deadline for filling this form is 08 April 2022, 23:59 UTC (the earlier the better, thanks).
Select Submission ID for Full Evaluation
Code for End of Competition Training pipelines
About 2 months ago
Hi @sergey_zlobin, all the compute budget pairs will be run with the purchase phase first. Then the post purchase evaluations will be run on each of the purchased sets. As is already the case, there is no time limit on the post purchase training, as it's entirely controlled by us.
Code for End of Competition Training pipelines
About 2 months ago
Hi Everyone,
Below are the links to the end of competition training pipelines that will be used for the final winner selection. Each pipeline will be run for 2 seeds. The code release is for your feedback, so please feel free to go through them.
In case you have no idea what I'm talking about, please check this post.
If there are any critical issues, we'll fix them.
I tried many different combinations and rejected some if the gap between random and all label purchase was too low; apologies if your favourite combination didn't make it.
Check the commit history to easily navigate the changes.
IMPORTANT: Details about end of competition evaluations
2 months ago
@Camaro Probably no extension as far as I know. Will inform in case of any changes.
:aicrowd: [Update] Round 2 of Data Purchasing Challenge is now live!
2 months ago
Yes, what you say makes sense, although on the other hand a very strong model was getting nearly as good as "all label purchase" scores with just random purchase, so the dataset needed more difficulty; important lessons learnt. In any case, I agree with your definition of useful. For now, we've come up with the end of competition evaluations scheme. Please check this recent post.
IMPORTANT: Details about end of competition evaluations
2 months ago
Hi Everyone,
I've seen multiple people's posts suggesting changes in the purchase phase training pipeline. This post is to clarify some details about the end of competition evaluations and explain some of its implications.
First of all, thanks a lot to everyone who has provided feedback for the training pipeline. It has been very valuable to us for the challenge design.
TL;DR - Please submit your best purchase strategies; you'll need to select them for end of competition evaluations, which will run 5 post purchase training pipelines.
Current problems
Many have pointed out that the training pipeline does not provide good scores. I'll break these down into some categories.
- Concerns about the best purchases not being incentivised correctly
  - Model is too weak and cannot learn hard examples - This is a legitimate concern, though I do not currently know the extent to which it applies. We're investigating this.
  - Not using GaussianBlur in testing changes the optimal labels - Currently I have not found any strong evidence of this, but would love to discuss further.
  - Model does not converge due to low epochs hence too stochastic - I did not find the stochasticity in scores to be too much and feel it is within expected practical limits. However, in the later part of this post I'll be addressing this further.
  These are the most important concerns we're looking to resolve. Some of them are also difficult to measure, and the discussion becomes qualitative and opinionated. We want to resolve it in the most quantitative and fair way possible.
- Concerns that the score is too low
  - The feature layers are frozen
  - The model is too small and plateaus
  - GaussianBlur is not used in test
  For these, I'd like to reiterate that the score is not important; rather, maximising the score by making the best purchase strategies is the goal of the competition.
End of competition Evaluations
In addition to changing the dataset for end of competition evaluations, we will run selected submissions on multiple training pipelines in the post purchase phase.
The detailed steps are given below:
- Eligible teams will select two of their submissions to evaluate - Eligibility criteria to be announced soon, it will be based on Round 2 leaderboard.
- Each submission will run through the pre-train and the purchase phase on the end of competition dataset.
- The same purchased labels will be put through 5 training pipelines - Details to be released soon.
- Each training pipeline will be run for 2 seeds and scores averaged, to address any stochasticity in scores.
- To avoid issues due to difference of average scores from different training pipelines, a Borda ranking system will be used.
We hope this will incentivize participants to select the best purchase strategies and not optimize for the current training pipeline. We're unable to incorporate this setup during the live round due to the prohibitive cost of compute for each run.
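The evaluation steps listed above (seed averaging per pipeline, then a Borda ranking across pipelines) can be sketched roughly as follows. This is only an illustration, not the organisers' evaluation code: the function names, the point scheme (worst gets 0, best gets n - 1), the tie handling, and the example scores are all assumptions.

```python
from statistics import mean


def borda_points(scores):
    """Map submission -> Borda points for one pipeline (higher score is better).

    Assumed scheme: the worst submission gets 0 points, the best gets n - 1,
    and tied submissions share the average of the points they span.
    """
    ordered = sorted(scores, key=scores.get)  # worst first
    points = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = (i + j) / 2
        for name in ordered[i:j + 1]:
            points[name] = shared
        i = j + 1
    return points


def rank_submissions(results):
    """results[pipeline][submission] is a list of per-seed scores."""
    totals = {}
    for per_submission in results.values():
        # Average the seeds first to dampen run-to-run stochasticity.
        seed_avg = {name: mean(seeds) for name, seeds in per_submission.items()}
        # Ranks, not raw scores, are summed, so pipelines with different
        # score scales contribute equally.
        for name, pts in borda_points(seed_avg).items():
            totals[name] = totals.get(name, 0) + pts
    return sorted(totals, key=totals.get, reverse=True)


# Hypothetical scores: 3 pipelines, 2 seeds each.
results = {
    "pipeline_1": {"sub_a": [0.80, 0.82], "sub_b": [0.78, 0.79], "sub_c": [0.70, 0.75]},
    "pipeline_2": {"sub_a": [0.40, 0.42], "sub_b": [0.45, 0.47], "sub_c": [0.30, 0.33]},
    "pipeline_3": {"sub_a": [0.91, 0.90], "sub_b": [0.88, 0.90], "sub_c": [0.89, 0.93]},
}
print(rank_submissions(results))
```

Note that `pipeline_2` produces much lower raw scores than the others, yet it carries the same weight in the final ranking, which is the point of ranking rather than averaging scores.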
Training pipeline survey
Please vote here for your favoured schemes. Note that this is not a vote to select the training pipelines, just a survey of participants' preferences, but we'll take the results very seriously.
- Unfreeze feature layers in base model
- Use Gaussian Blur During Test
- Train for more epochs
- Use bigger model - Efficientnet-B7
- Train more epochs + Unfreeze feature layers in base model
- Train more epochs + Use Gaussian Blur During Test
- Remove Gaussian Blur from training
- Unfreeze feature layers in base model + Remove Gaussian Blur from training
Please feel free to suggest any other changes if I have missed them.
:aicrowd: [Update] Round 2 of Data Purchasing Challenge is now live!
2 months ago
Hi @tfriedel
Seems I missed your message. On the point of ruling out the most useful images, do you feel it's a huge issue with the current training pipeline?
We do want the solutions to be as agnostic to the training pipeline as possible while buying the best images possible. Yes, it's not completely possible to make things training agnostic, but that is the spirit of the competition we'd like to promote. If you're finding that you're deliberately having to remove too many of the useful images, please let me know.
Baseline Released :gift:
2 months ago
We hope you have been following the development around the challenge.
Today, we are finally releasing a baseline, which was promised in the last townhall.
Did you miss the townhall?
You can check about it here: Town Hall Recording & Resources from top participants
About the Baseline
The baseline contains fast heuristic implementations of some simple ideas.
- Purchase images with more labels - For multilabel datasets, often having images with more than one label gives a boost for deep learning models.
- Purchase uncertain images - Purchase images which have the most uncertainty in their predictions. While many methods exist to measure uncertainty, a simple output probability based heuristic method is used here.
- Purchase images to balance labels - Well balanced datasets can improve model performance in deep learning. We set a uniform target distribution and try to purchase labels to get closer to that distribution. The provided code can try to purchase labels to any target distribution.
Learn more about the implementation here.
Jump to the codebase directly here.
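To make the three heuristics described above more concrete, here is a minimal pure-Python sketch. It is not the released baseline code: the function names, the 0.5 threshold, the use of mean binary entropy as the uncertainty measure, and the greedy balancing rule are all assumptions chosen for illustration. Each image is represented by its predicted per-label probabilities from some already-trained model.

```python
import math


def label_count_score(probs, threshold=0.5):
    """Number of labels predicted present; more labels means higher priority."""
    return sum(p >= threshold for p in probs)


def uncertainty_score(probs):
    """Mean binary entropy of the per-label probabilities (largest at p = 0.5)."""
    def entropy(p):
        if p <= 0.0 or p >= 1.0:
            return 0.0
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return sum(entropy(p) for p in probs) / len(probs)


def pick_top(pred_probs, score_fn, budget):
    """Indices of the `budget` highest-scoring unlabelled images."""
    order = sorted(range(len(pred_probs)),
                   key=lambda i: score_fn(pred_probs[i]), reverse=True)
    return order[:budget]


def pick_balanced(pred_probs, counts, target, budget, threshold=0.5):
    """Greedily buy images whose predicted labels are most under-represented.

    counts: current label counts in the labelled set.
    target: desired share of each label (sums to 1); uniform in the example.
    """
    counts = list(counts)
    remaining = list(range(len(pred_probs)))
    chosen = []
    while remaining and len(chosen) < budget:
        total = sum(counts) or 1
        deficit = [t - c / total for t, c in zip(target, counts)]

        def gain(i):
            return sum(d for d, p in zip(deficit, pred_probs[i]) if p >= threshold)

        best = max(remaining, key=gain)
        remaining.remove(best)
        chosen.append(best)
        # Update the counts as if the predicted labels were the true ones.
        for k, p in enumerate(pred_probs[best]):
            if p >= threshold:
                counts[k] += 1
    return chosen


# Hypothetical predictions for 4 unlabelled images and 3 labels.
pred_probs = [
    [0.90, 0.80, 0.10],
    [0.55, 0.45, 0.40],
    [0.10, 0.10, 0.95],
    [0.90, 0.90, 0.90],
]
print(pick_top(pred_probs, label_count_score, budget=2))
print(pick_top(pred_probs, uncertainty_score, budget=2))
print(pick_balanced(pred_probs, counts=[10, 2, 2], target=[1/3, 1/3, 1/3], budget=2))
```

Passing a non-uniform `target` to `pick_balanced` steers purchases towards any other label distribution, mirroring the description of the balancing heuristic.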
Looking for more resources to explore? Check out this thread.
Ok, talk is easy, how does it perform?
Well, as it stands, the baseline is currently in 5th position on the overall leaderboard.
Do you have additional questions?
Feel free to drop them in this thread and we will reply to them as soon as possible.
What are you waiting for? Let's get started with the submissions!
:aicrowd: [Update] Round 2 of Data Purchasing Challenge is now live!
2 months ago
Hi @Camaro
Thanks for the clarification and useful explanation. I'll consider this and try some experiments. One issue with designing the challenge is that it needs a balance between a good training pipeline, a score gap for better data, and compute constraints. All of these need to be satisfied while iterating on our synthetic data. We'll try to improve the training pipeline accordingly if we're able to match them in a reasonable way.
:aicrowd: [Update] Round 2 of Data Purchasing Challenge is now live!
2 months ago
Hi @Camaro
I understand your concern about underfitting. However, this challenge is a bit non-traditional: the data is completely synthetic, and the purpose is solely research on the methods used. The final model trained is not of importance to any real-world setting; only the algorithms you develop are.
Also, the purpose of the training pipeline is to give a level playing field to all participants so that they focus on the purchase strategies. Whether the final loss value is reached is not important as long as better purchased data produces better scores using the same training pipeline, which is what we've tried to set up with Round 2.
Your goal is to improve the score by purchasing better labels, the score may be limited by the training setup once the best labels are purchased, but that is not yet the case in my opinion.
Why there is no GaussianBlur in test transform?
2 months ago
Hi @chuifeng
Yes, the GaussianBlur in training should have been applied with a probability value to get higher scores. Or, as you suggest, it could be used during test, which would also increase the score.
Though I'd still like to stress that the gap in scores between random purchase and the all-label purchase is the criterion that is more important than the raw scores.
You can refer to the discussion here:
:aicrowd: [Update] Round 2 of Data Purchasing Challenge is now live!
3 months ago
Hi @tfriedel
Currently, I do not see this much spread in scores when testing on the private set which is used for the leaderboard. Nevertheless, thanks for pointing this out; I will check further on the score spread across the budgets used.
Also note that the final end of competition evaluations will not use the dataset used on the leaderboard now, but a different dataset sampled from the same distribution. The end of competition evaluations will also feature more exhaustive evaluations with many more models. Hence overfitting the leaderboard is likely to hurt participants when the final evaluations take place. We'll make communications about this clearer in case this isn't properly explained.
MABe 2022: Mouse Triplets
Evaluations are being re-run with balanced class weights
About 2 months ago
Hi Everyone,
As you might have recently been notified by e-mail, the leaderboard is currently being re-run with balanced class weights. The scores might be unstable until all the runs finish. Please do not worry if you suddenly see your ranks jumping around too much. They should stabilize once all the runs are done.
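For readers unfamiliar with the term, one common way to compute balanced class weights is the inverse-frequency rule `n_samples / (n_classes * count)`, the same formula scikit-learn uses for `class_weight="balanced"`. The post does not say how the evaluator computes its weights, so the sketch below is an assumption for illustration only, with hypothetical labels.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Hypothetical imbalanced binary task: 90 negative frames, 10 positive frames.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With these weights each class contributes the same total weight to the loss (90 × 0.556 ≈ 10 × 5.0), so a rare behavior is not drowned out by the majority class.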
Notebooks
- Unsupervised model - SimCLR - Ant-Beetles Video Data: Unsupervised model training using contrastive learning with modified SimCLR (dipam · About 1 month ago)
- Unsupervised model - SimCLR - Mouse Video Data: Unsupervised model training using contrastive learning with modified SimCLR (dipam · About 1 month ago)
- Getting Started - Mouse-Triplets Video Data: Initial data exploration and a basic embedding using a vision model (dipam · About 2 months ago)
- Getting Started - Ant-Beetles Video Data: Initial data exploration and a basic embedding using a vision model (dipam · About 2 months ago)
- BSuite Challenge Starter Kit IITM RL Final Project: Bsuite starter kit with random baseline (dipam · About 1 year ago)
- Solution for submission 128367: A detailed solution for submission 128367 submitted for challenge IIT-M RL-ASSIGNMENT-2-GRIDWORLD (dipam · About 1 year ago)
- Solution for submission 130090: A detailed solution for submission 130090 submitted for challenge IIT-M RL-ASSIGNMENT-2-GRIDWORLD (dipam · About 1 year ago)
- Solution for submission 128401: A detailed solution for submission 128401 submitted for challenge IIT-M RL-ASSIGNMENT-2-GRIDWORLD (dipam · About 1 year ago)
- Solution for submission 128400: A detailed solution for submission 128400 submitted for challenge IIT-M RL-ASSIGNMENT-2-TAXI (dipam · About 1 year ago)
- Taxi Notebook IITM RL Assignment 2: Notebook to be filled for IITM RL Assignment 2 Taxi (dipam · About 1 year ago)
- Gridworld Notebook IITM RL Assignment 2: Notebook to be filled for IITM RL Assignment 2 Gridworld (dipam · About 1 year ago)
Scores of all tasks made public
24 days ago
Hi Everyone,
We've decided to make the scores of all the tasks and the borda ranks of each submission public. Now you can view the scores of all the tasks. Our original motivation was to prevent overfitting on the tasks, in the spirit of the challenge being unsupervised representation learning. But we understand that it's frustrating for participants to not have any source of feedback on the tasks and on why one of their submissions does better than another.
P.S: The scores visible are on the public split of the data. (Ignore that the names say private). The scores on the private split will be available after the competition ends, and will be used for selecting the winners (No changes here)
P.P.S: Negative scores indicate MSE scores; these are made negative because the borda system requires "higher is better". So you can ignore the negative sign when looking at the MSE scores.
For those interested, here's a short description of how the selection system works (taken from this discussion on AIcrowd Discord).
The selection system we have works like this:
First your own submissions are borda ranked, the best among these is selected, then your submission is borda ranked against other top submissions from each team. This is done to prevent any team from having multiple entries to increase their rank gap based on a single task.
As a consequence of this, you are also competing against your submissions in the borda system. In this case, your new submissions may perform slightly better in average borda rank against your own submissions, because it can significantly outperform on a few of the hidden tasks.
Note that the above case may cause a submission with a lower average F1 score to be selected.
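The two-stage selection described above can be sketched as follows. This is a rough illustration rather than the leaderboard's actual code: the function names, the pairwise point scheme (1 point per rival beaten on a task, 0.5 per tie), and the example scores are assumptions. MSE tasks are stored negated so that higher is better on every task.

```python
def average_borda(task_scores):
    """task_scores[name] is a list of per-task scores, higher is better.

    Returns name -> mean Borda points across tasks, where on each task an
    entry earns 1 point per rival it beats and 0.5 per tie (assumed scheme).
    """
    names = list(task_scores)
    n_tasks = len(task_scores[names[0]])
    totals = {name: 0.0 for name in names}
    for t in range(n_tasks):
        for name in names:
            for other in names:
                if other == name:
                    continue
                if task_scores[name][t] > task_scores[other][t]:
                    totals[name] += 1.0
                elif task_scores[name][t] == task_scores[other][t]:
                    totals[name] += 0.5
    return {name: totals[name] / n_tasks for name in names}


def select_and_rank(teams):
    """teams[team][submission] is a list of per-task scores."""
    chosen, best_scores = {}, {}
    # Stage 1: Borda-rank each team's own submissions and keep the best one.
    for team, submissions in teams.items():
        own = average_borda(submissions)
        winner = max(own, key=own.get)
        chosen[team] = winner
        best_scores[team] = submissions[winner]
    # Stage 2: Borda-rank the single selected submission of every team.
    final = average_borda(best_scores)
    return chosen, sorted(final, key=final.get, reverse=True)


# Hypothetical tasks: [F1, F1, negated MSE].
teams = {
    "team_a": {"sub_1": [0.90, 0.50, -0.30], "sub_2": [0.60, 0.55, -0.20]},
    "team_b": {"sub_3": [0.70, 0.52, -0.25]},
}
chosen, ranking = select_and_rank(teams)
print(chosen, ranking)
```

In this example `sub_1` has the higher average F1 (0.70 vs 0.575), but `sub_2` wins two of the three tasks against it and is therefore the one selected for `team_a`, which is the effect the note above describes.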