A benchmark for image-based food recognition
Using AI For Building's Energy Management
What data should you label to get the most value for your money?
Latest submissions
failed | 184202
graded | 179185
graded | 179000
Behavioral Representation Learning from Animal Poses.
Classify images of snake species from around the world
Data Purchasing Challenge 2022
Code for End of Competition Training pipelines
Over 2 years ago
Each of the 5 training pipelines will have its own budget, right?
[Update] Round 2 of Data Purchasing Challenge is now live!
Over 2 years ago
Hi, it seems there's a bug in local_evaluation.py.
I think you should change
time_available = COMPUTE_BUDGET - (time_started - time.time())
to
time_available = COMPUTE_BUDGET - (time.time() - time_started)
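To see why the operand order matters, here is a minimal sketch with fixed timestamps. The budget value and the `time_available` helper are illustrative assumptions, not code from local_evaluation.py.

```python
import time

COMPUTE_BUDGET = 60.0  # seconds; illustrative value, not the challenge's real budget


def time_available(time_started, now=None):
    """Return the seconds left in the budget, given the start timestamp."""
    if now is None:
        now = time.time()
    elapsed = now - time_started  # grows as the clock moves forward
    return COMPUTE_BUDGET - elapsed


# Fixed timestamps: the run started at t=1000 and it is now t=1010.
start = 1000.0
remaining = time_available(start, now=1010.0)

# With the original operand order the difference is negative,
# so the "available" time grows as time passes instead of shrinking.
buggy = COMPUTE_BUDGET - (start - 1010.0)

print(remaining, buggy)
```

After ten seconds the corrected expression reports less time than the budget, while the original order reports more.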
0.9+ Baseline Solution for Part 1 of Challenge
Over 2 years ago
Thanks for publishing your solution!
Do you know how much of a boost in accuracy "pseudolabel remaining dataset" gives?
I didn't use it.
Experiments with "unlabelled" data
Over 2 years ago
I've checked it locally.
Using all 10K images beats my choice of 3K by 0.006. Maybe I can recover some of that gap by changing the purchasing algorithm, but I still feel I need to tune my model.
Experiments with "unlabelled" data
Over 2 years ago
The scores I wrote are from the leaderboard. I can't check 10K there…
Local scores are a little higher than LB, but correlated with LB.
Yeah, maybe I'll check it locally.
Experiments with "unlabelled" data
Over 2 years ago
Here are just my results. I used the same model, but different purchase modes.
- Train with initial 5000 images only: LB 0.869
- Add 3000 random images from unlabelled dataset: 0.881
- "smart" purchasing (at least non random): 0.888
So we see that some "smart" purchasing helps, but not by much, maybe ~0.01.
Tuning the model would probably help more to push further.
First round doesn't matter?
Over 2 years ago
If I understood correctly, the first round counts for little and is only preliminary. The second round is decisive, right?
Size of Datasets
Over 2 years ago
Ahh… I see, so AICrowd runs the whole pipeline twice, and I can see logs only from the debug version.
Great, thanks!
Size of Datasets
Over 2 years ago
Hello!
During submission, both the training dataset and the unlabelled dataset have a size of only 100.
Probably it is the debug version.
Is this intentional?
Potential loophole in purchasing phase
Over 2 years ago
I think local evaluation can be modified somehow.
Maybe the ZEWDPCProtectedDataset class could be changed so that it doesn't give you the label in a sample.
Allowance of Pre-trained Model
Over 2 years ago
Sorry, what's the right way to use a pre-trained model?
I've tried "models.resnet18(pretrained=True)" but it failed with
urllib.error.URLError: <urlopen error [Errno 99] Cannot assign requested address>
Share your solutions!
Over 2 years ago
Hello, I want to share my solution.
The competition was very interesting and unusual. It was my first competition on the AIcrowd platform, and the guides, pages and discussions were very helpful for me. So thanks to the organizers!!!
Actually my solution is very similar to xiaozhou_wang's.
I have two strategies. The first is based on the idea of collecting samples from "hard" classes (it came from Round 1). Suppose we have a trained model and know the F1-measure of each of the six classes on validation. Sum the class predictions with weights equal to 1 - f1_validation, then choose the samples with the largest weighted sum.
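The first strategy can be sketched roughly as follows. This is a minimal illustration in plain Python, not the original code: the function names, the toy predictions and the F1 values are all made up, and only three classes are used for brevity instead of six.

```python
def hard_class_scores(predictions, f1_validation):
    """Weight each class prediction by (1 - validation F1) and sum per sample."""
    weights = [1.0 - f1 for f1 in f1_validation]
    return [sum(w * p for w, p in zip(weights, row)) for row in predictions]


def select_top(scores, budget):
    """Return the indices of the `budget` samples with the largest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:budget]


# Hypothetical per-class validation F1: class 1 is the "hardest".
f1_validation = [0.9, 0.5, 0.8]

# Hypothetical per-class predictions for three unlabelled samples.
predictions = [
    [0.9, 0.1, 0.1],  # mostly the easy class 0
    [0.1, 0.9, 0.1],  # mostly the hard class 1
    [0.1, 0.1, 0.9],  # mostly class 2
]

scores = hard_class_scores(predictions, f1_validation)
chosen = select_top(scores, budget=2)
print(chosen)
```

Samples dominated by poorly performing classes get the highest scores, so they are purchased first.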
The second strategy is to collect samples with higher uncertainty. I consider a prediction of 0.5 the most uncertain, so I just sum the absolute value of (0.5 - prediction) over all classes.
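A rough sketch of the second strategy, again with made-up names and toy numbers. One assumption here is not stated in the post: the samples with the smallest summed distance from 0.5 are treated as the most uncertain and picked first.

```python
def uncertainty_distance(row):
    """Sum of |0.5 - p| over all classes; smaller means more uncertain."""
    return sum(abs(0.5 - p) for p in row)


def select_most_uncertain(predictions, budget):
    """Return indices of the `budget` samples closest to 0.5 overall (assumed rule)."""
    order = sorted(
        range(len(predictions)),
        key=lambda i: uncertainty_distance(predictions[i]),
    )
    return order[:budget]


# Hypothetical per-class predictions (three classes for brevity).
predictions = [
    [0.95, 0.05, 0.9],  # confident everywhere
    [0.5, 0.6, 0.4],    # close to 0.5 everywhere
    [0.7, 0.2, 0.5],    # in between
]

chosen = select_most_uncertain(predictions, budget=2)
print(chosen)
```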
I also considered the third strategy from the hosts, "match labels to target distribution", but the result was worse with it than without it. P.S. to the organizers: this code is still in my solution since I experimented with it, but it takes very few samples and I think it doesn't matter for the score.
I tried several ratios between the first two strategies, but I didn't see an obvious advantage for either of them. So finally I used both strategies with an equal budget.
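One way to combine two rankings under an equal budget is sketched below. The `split_budget` helper and the handling of duplicates are assumptions made for illustration; the post does not say how overlapping picks were treated.

```python
def split_budget(ranked_hard, ranked_uncertain, budget):
    """Take half the budget from the first ranking, fill the rest from the second.

    Indices already chosen by the first strategy are skipped (an assumption).
    """
    half = budget // 2
    chosen = list(ranked_hard[:half])
    taken = set(chosen)
    for idx in ranked_uncertain:
        if len(chosen) >= budget:
            break
        if idx not in taken:
            chosen.append(idx)
            taken.add(idx)
    return chosen


# Hypothetical sample indices, best first, from each strategy.
ranked_hard = [4, 1, 7, 2]
ranked_uncertain = [1, 3, 4, 9]

purchase = split_budget(ranked_hard, ranked_uncertain, budget=4)
print(purchase)
```

With an odd budget the second ranking receives the extra sample in this sketch.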
I saw the idea of "Active Learning" in one of the papers and decided to make several iterations (let's say, L).
The problem was to calculate the number L of iterations. My way is not as clever as xiaozhou_wang's. I noticed that ~300 samples are enough for one iteration. Moreover, in my experiments more iterations sometimes worsened the result. I looked at the submissions table to estimate training time and inference time. So I came to the formula (I have a Pretraining Phase, so the first iteration doesn't need training)
For training I used efficientnet_b3, 5 epochs with
and the following augmentations