| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| MS COCO image captioning (test) | Ours (MLP) | Precision | 88 | 27 | 1mo ago |
| Olym-Math | Qwen3-0.6B | S_incor | 8.42 | 14 | 21d ago |
| AIME 2025 | Qwen3-0.6B | S_incor | 6.46 | 14 | 21d ago |
| AIME 2024 | Qwen3-0.6B | S_incor Score | 9.29 | 14 | 21d ago |
| Math 500 | Qwen3-0.6B | S_incor | 9.86 | 14 | 21d ago |
| RAGognize | RAGognizer | AUROC | 93.33 | 7 | 1mo ago |
| RAGTruth QA | LettuceDetect-L | AUROC | 95.6 | 7 | 1mo ago |
| HalLoc Caption | HalLocalizer | Object Precision | 66 | 7 | 3mo ago |
| HalLoc Instruct | HalLocalizer | Object Precision | 94 | 7 | 3mo ago |
| HalLoc VQA | HalLocalizer | Object Precision | 61 | 7 | 3mo ago |
| AIME 2025 | TOKENHD-8B | AUROC | 89.47 | 5 | 21d ago |
| AIME 2024 | TOKENHD-8B | AUROC | 87.39 | 5 | 21d ago |
| HDM-Bench | RAGognizer | AUROC | 77.03 | 5 | 1mo ago |