| Dataset Name | SOTA Method | Metric | Value | Trend | Last Updated |
|---|---|---|---|---|---|
| MAQA | P(True) | Hamming Distance | 0.04 | 52 | 21d ago |
| MAQA-ΔK−1 | P(True) | KL Divergence | -0.149 | 48 | 3mo ago |
| QUEST | SPADER | Precision | 23.3 | 16 | 1d ago |
| WebQSP | SPADER | Precision | 50.3 | 16 | 1d ago |
| Mintaka | SPADER | Precision | 64.2 | 16 | 1d ago |
| JEC-QA | InternVL3-2B | Score | 64.2 | 4 | 2mo ago |
| MMLU Multi-Answer | InternVL3-2B | Overall Score | 61.45 | 4 | 2mo ago |
| AMBIGQA (test) | recall-then-verify | F1 (All Questions) | 46.2 | 3 | 3mo ago |
| AMBIGQA (dev) | recall-then-verify | F1 (All Questions) | 52.1 | 3 | 3mo ago |
| WEBQSP (test) | recall-then-verify | F1 (All Questions) | 0.558 | 2 | 3mo ago |
| WEBQSP (dev) | recall-then-verify | F1 (All Questions) | 55.4 | 2 | 3mo ago |