0 Followers
1 Following
saidinesh_pola
saidinesh pola

Location: Bangalore, IN

Badges: 5 / 4 / 2


Activity

[Contribution heatmap: Oct through Oct]

Ratings Progression


Challenge Categories


Challenges Entered

Improve RAG with Real-World Benchmarks
Latest submissions: failed 266505, graded 266503, graded 266497

Latest submissions: graded 251762, graded 251753, graded 251702

Latest submissions: graded 252068, graded 250634, graded 250633

Multi-Agent Dynamics & Mixed-Motive Cooperation
Latest submissions: graded 243898, failed 242953, failed 242945

Latest submissions: submitted 246741, submitted 246661, submitted 246539

Latest submissions: none

Latest submissions: none

Small Object Detection and Classification
Latest submissions: graded 240507, graded 240506, graded 240490

Understand semantic segmentation and monocular depth estimation from downward-facing drone images
Latest submissions: submitted 218884, graded 218883, submitted 218875

Latest submissions: graded 210413, failed 210389, graded 210283

A benchmark for image-based food recognition
Latest submissions: graded 181873, graded 181872, graded 181870

Using AI For Building's Energy Management
Latest submissions: graded 205123, failed 204464, failed 204102

What data should you label to get the most value for your money?
Latest submissions: none

Interactive embodied agents for Human-AI collaboration
Latest submissions: none

Latest submissions: graded 186830, graded 186829, graded 186828

Improving the HTR output of Greek papyri and Byzantine manuscripts
Latest submissions: none

Machine Learning for detection of early onset of Alzheimers
Latest submissions: none

5 Puzzles 21 Days. Can you solve it all?
Latest submissions: failed 164556, failed 164555

5 Puzzles 21 Days. Can you solve it all?
Latest submissions: graded 157708

A benchmark for image-based food recognition
Latest submissions: none

5 Puzzles, 3 Weeks. Can you solve them all? 😉
Latest submissions: none

Project 2: Road extraction from satellite images
Latest submissions: none

Project 2: build our own text classifier system, and test its performance.
Latest submissions: none

5 PROBLEMS 3 WEEKS. CAN YOU SOLVE THEM ALL?
Latest submissions: none

Predict if users will skip or listen to the music they're streamed
Latest submissions: none

5 puzzles and 1 week to solve them!
Latest submissions: none

Estimate depth in aerial images from monocular downward-facing drone
Latest submissions: submitted 218884, graded 218883, graded 218801

Perform semantic segmentation on aerial images from monocular downward-facing drone
Latest submissions: submitted 218875, graded 218874, submitted 218871

Latest submissions: graded 252068, graded 250634, graded 250621

Latest submissions: graded 250633, graded 250629, failed 250626

Testing RAG Systems with Limited Web Pages
Latest submissions: graded 265880, failed 265828, graded 265758
Participant Rating
gaurav_singhal 0

Meta Comprehensive RAG Benchmark: KDD Cup 2-9d1937

Client Failed

4 months ago

You can try it, but I just pinned it to the original version based on russwest404's suggestion.

Client Failed

4 months ago

Yeah, it's resolved after removing langchain and pinning the vllm version to 0.4.2.
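For anyone hitting the same failure, a minimal sketch of enforcing that pin at container startup (the guard itself is illustrative and not part of the starter kit; the pin would live in requirements.txt as vllm==0.4.2):

```python
# Hypothetical startup guard: fail fast if the built image resolved a different vLLM.
import vllm

EXPECTED = "0.4.2"
assert vllm.__version__ == EXPECTED, (
    f"expected vLLM {EXPECTED}, got {vllm.__version__}; "
    "rebuild the image with the pinned requirements"
)
```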

Client Failed

4 months ago

Thanks, I read this on Reddit as well. I will try without langchain.

Client Failed

4 months ago

Error in create_agent: Timed out after 601 seconds waiting for clients. 1/4 clients joined.
Any idea why this is happening? Is it related to my code, or is it an AIcrowd server failure?

Can we have a multi-GPU submission example?

5 months ago

Is there a baseline for multi-GPU submission via DDP or Accelerate?
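For context, a minimal PyTorch DDP skeleton of the kind being asked about (the toy model, tensor sizes, and script name are placeholders, not an official baseline):

```python
# train.py -- minimal DDP sketch; launch with: torchrun --nproc_per_node=4 train.py
# Everything here is illustrative, not challenge code.
import os

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")      # one process per GPU
    local_rank = int(os.environ["LOCAL_RANK"])   # injected by torchrun
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(512, 512).to(local_rank)  # placeholder model
    model = DDP(model, device_ids=[local_rank])

    x = torch.randn(8, 512, device=f"cuda:{local_rank}")
    model(x).sum().backward()                    # gradients are all-reduced across ranks

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```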

GitLab DevOps issue

5 months ago

The GitLab issue page does not auto-refresh by default; we have to refresh manually every time to see new Evaluation Logs updates on the issue page.

Couldn't connect to gitlab

5 months ago

Never mind, it's working after setting up the SSH key again.

Couldn't connect to gitlab

5 months ago

git push
kex_exchange_identification: Connection closed by remote host
Connection closed by 52.72.8.30 port 22
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

Can we assume the same websites appear in both the test and training datasets?

6 months ago

Can we assume the same websites appear in both the test and training datasets?

Meta KDD Cup 24 - CRAG - Retrieval Summarization

Are the evaluation QA values present in qa.json correct?

6 months ago

The evaluation dataset contains stock-price questions, but none of the answers are accurate as of Feb 16th, which is the last date given in the dataset. I used Nasdaq and MarketWatch to check the prices on that day, and none of them matched. You can check the closing price at https://www.nasdaq.com/market-activity/stocks/tfx. For example, the closing price on Feb 16th is $251.07, but the answer is given as $249.07:

"interaction_id": 0,
"query": "what's the current stock price of Teleflex Incorporated Common Stock in USD?",
"answer": "249.07 USD"
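A quick way to reproduce this cross-check programmatically (the yfinance library and the 2024 year are my assumptions; the challenge does not ship this script):

```python
# Hypothetical cross-check of the Feb 16th TFX close (assuming the year is 2024).
import yfinance as yf

hist = yf.Ticker("TFX").history(start="2024-02-16", end="2024-02-17")
print(hist["Close"])  # compare against the "249.07 USD" answer in qa.json
```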

Generative Interior Design Challenge 2024

Top teams solutions

6 months ago

:clap: Nice use of an external dataset and converting it to this challenge's format. I explored different variations at inference time using prompt engineering. I used a better segmentation model (swin-base-IN21k) and modified the control items to include pillars as well for better geometry, along with different prompt-engineering techniques. Even though the baseline gave me a better score, it was really inconsistent. Finally, I submitted a Realistic Vision model from ComfyUI, which gave stable and consistent results, and given the human evaluations I did expect some randomness in the leaderboard. I would like to express my gratitude to the organizers of this challenge. The challenge is new and exciting, but because there are only 40 images in the test dataset, the human evaluations are much noisier and inconsistent. It was really fun exploring Stable Diffusion models and their adapters; I want to revisit this in the near future when I have more compute.
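Roughly, the final pipeline described above would look like this in diffusers (the checkpoint IDs and the blank control map are illustrative assumptions; the actual submission went through ComfyUI):

```python
# Hypothetical sketch: a Realistic Vision checkpoint driven by a segmentation ControlNet.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-seg", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

seg_map = Image.new("RGB", (512, 512))  # placeholder for a real ADE20K-style segmentation map
image = pipe(
    "a modern living room, interior design, photorealistic",
    image=seg_map,
    num_inference_steps=30,
).images[0]
image.save("room.png")
```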

πŸ† Generative Interior Design Challenge: Top 3 Teams

6 months ago

@lavanya_nemani you cannot judge performance on the test dataset based on 3 public images. Since the scores are really close, annotator preferences can shift the ranking a little.

πŸ† Generative Interior Design Challenge: Top 3 Teams

6 months ago

Congratulations to the winners! Please also post this in Discord; we had no idea this post existed.

Build fail -- Ephemeral storage issue

6 months ago

@lavanya_nemani The maximum size your submission can have is 10 GB, so keep at most 10 GB in the submission tag. Check your models folder and delete unused files. The repo itself can exceed 10 GB; you don't need to delete those files.
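A quick pre-tag sanity check along these lines (the models/ path and the script itself are illustrative, not part of the starter kit):

```python
# Hypothetical pre-submission check: total size of the files you intend to ship.
from pathlib import Path

LIMIT_BYTES = 10 * 1024**3  # 10 GB cap on the submission contents

def dir_size(root: str) -> int:
    return sum(p.stat().st_size for p in Path(root).rglob("*") if p.is_file())

size = dir_size("models")  # assumed location of the model weights
print(f"models/: {size / 1024**3:.2f} GB ->",
      "OK" if size <= LIMIT_BYTES else "over the 10 GB limit")
```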

Submission stuck at intermediate state

6 months ago

The evaluation of my submission #251530 completed, but it got stuck and did not proceed to the human evaluation phase.

Commonsense Persona-Grounded Dialogue Chall-459c12

Service Announcement: Delays in GPU Node Provisioning

7 months ago

Can we try new submissions now?

Task 1: Commonsense Dialogue Response Generation

Updates to Task 1 Metrics

7 months ago

The only problem is that the leaderboard is dominated by ChatGPT prompt-engineering (PE) entries, but the PeaCoK paper's human evaluation did not include ChatGPT/GPT-4. It is as if their experimental results could be invalidated simply by prompt engineering. Could someone please clarify this?

In the human evaluation, we find that facts generated by COMET-BART receive a high acceptance rate by crowdworkers for plausibility, slightly beating fewshot GPT-3. We also find that zero-shot GPT-3.5 model, although more advanced than the GPT-3 baseline model, scores, on average, ∼15.3% and ∼9.3% lower than COMET-BART in terms of automatic metrics and human acceptance, respectively.

Is anyone encountering an SSL issue with the image build caching API? I don't know if there is anything I can do about this.

7 months ago

The same thing is happening for me in task 2 as well. The evaluation gets initialized, but then it reports a time-out error. @dipam Can you make the logs easier to understand?

AI enthusiast trying to learn new technologies. Glad to collaborate with others on ML challenges.

Notebooks
