| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| RAGQA Leaderboard (test) | SFT+RL(Rhybrid) | AVG Score | 85 | 29 | 1mo ago |
| DeepSearch Average | ActiveContext | Success Rate (SR) | 57.1 | 23 | 5d ago |
| DeepSearch Bamboogle | Synapse | Success Rate (SR) | 72 | 23 | 5d ago |
| DeepSearch Musique | Synapse | Success Rate (SR) | 46 | 23 | 5d ago |
| DeepSearch 2wiki | Training-Free GRPO | Success Rate (SR) | 68 | 23 | 5d ago |
| DeepSearch HotpotQA | ActiveContext | Success Rate (SR) | 56 | 23 | 5d ago |
| DeepSearch PopQA | ActiveContext | Success Rate (SR) | 64 | 23 | 5d ago |
| DeepSearch TriviaQA | ActiveContext | Success Rate (SR) | 78 | 23 | 5d ago |
| DeepSearch NQ | Synapse | Success Rate (SR) | 86 | 23 | 5d ago |
| NQ | PoisonedRAG | ATR | 100 | 16 | 20d ago |
| MS-MARCO | PoisonedRAG | ATR | 98 | 16 | 20d ago |
| HotpotQA | PoisonedRAG | ATR | 100 | 16 | 20d ago |