A RAG QA system takes a question Q as input and outputs an answer A; the answer is generated by an LLM, either from information retrieved from external sources or directly from knowledge internalized in the model. The answer should provide useful information that addresses the question, without hallucinated or harmful content such as profanity.
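The question-in, answer-out flow described above can be sketched as follows. This is only an illustrative outline, not the benchmark's reference implementation: `answer_question`, `retrieve`, and `generate` are hypothetical names, and the prompt layout is an assumption.

```python
from typing import Callable, List


def answer_question(
    question: str,
    retrieve: Callable[[str], List[str]],  # hypothetical retriever: question -> passages
    generate: Callable[[str], str],        # hypothetical LLM call: prompt -> text
) -> str:
    """Return an answer A for question Q, using retrieved passages when any exist."""
    passages = retrieve(question)
    if passages:
        # Ground the answer in external sources.
        context = "\n".join(f"- {p}" for p in passages)
        prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    else:
        # No external evidence: fall back to knowledge internalized in the model.
        prompt = f"Question: {question}\nAnswer:"
    return generate(prompt).strip()
```

Passing the retriever and generator as callables keeps the sketch independent of any particular search backend or model.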
Task #3: End-to-End RAG.
Similar to Task #2, Task #3 also provides both web search results and mock APIs as sources for retrieval. However, it offers 50 web pages as candidates, instead of just 5. The larger set of web pages is more likely to provide the necessary information to answer the question but is also more likely to include irrelevant data. As such, Task #3 further evaluates how the RAG system ranks a larger number of retrieval results.
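To make the ranking step concrete, here is a minimal sketch that scores candidate pages by lexical overlap with the question and keeps the top few. It is one simple baseline chosen for illustration, not a method prescribed by the task. The `text` key, the `rank_pages` function, and the `top_k` default are assumptions and do not reflect the dataset's actual schema.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_pages(
    question: str,
    pages: List[Dict[str, str]],  # each page is assumed to carry a "text" field
    top_k: int = 5,
) -> List[Dict[str, str]]:
    """Return the top_k pages with the most occurrences of question terms."""
    query_terms = set(tokenize(question))

    def score(page: Dict[str, str]) -> int:
        counts = Counter(tokenize(page.get("text", "")))
        return sum(counts[term] for term in query_terms)

    # sorted() is stable, so pages with equal scores keep their original order.
    return sorted(pages, key=score, reverse=True)[:top_k]
```

With 50 candidates per question, a filter of this kind would discard most irrelevant pages before the remaining text is passed to the LLM; a stronger system would likely replace the overlap score with a learned or embedding-based ranker.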
To download the data, please see: https://www.aicrowd.com/challenges/meta-comprehensive-rag-benchmark-kdd-cup-2024/problems/end-to-end-retrieval-augmented-generation/dataset_files
To learn more about the CRAG challenge, please see: https://www.aicrowd.com/challenges/meta-comprehensive-rag-benchmark-kdd-cup-2024