| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| TabFact | Human | Accuracy 92.1 | 73 | 4d ago | |
| FEVER | COT-SC + VE | Accuracy 53.9 | 67 | 4d ago | |
| FEVER (dev) | ROBERTaBase Flexible (IE1) + MSPP (REk) | Label Accuracy 82.1 | 57 | 4d ago | |
| FEVER (test) | ProoFVer-SB | LA Score 79.47 | 32 | 4d ago | |
| RAWFC | KG-CRAFT | Precision 81.63 | 30 | 4d ago | |
| MINE | Hyper-KGGen+ | Accuracy 84.73 | 28 | 4d ago | |
| FEVER 1.0 (dev) | ProoFVer | Label Accuracy 89.07 | 23 | 4d ago | |
| LIAR | ERM | F1 Score 68.6 | 18 | 4d ago | |
| FactKG | SeleCom | Accuracy 67.44 | 17 | 4d ago | |
| FEVER-Symmetric | RoBERTa-large | Precision 88 | 16 | 4d ago | |
| FACT | | Accuracy 99.44 | 15 | 4d ago | |
| FEVER 1.0 (test) | KGAT | Label Accuracy 74.07 | 14 | 4d ago | |
| VitaminC | CPO | Accuracy (%) 54 | 12 | 4d ago | |
| FEVER-S | CPO | Accuracy 54 | 12 | 4d ago | |
| FEVER | ToT | Accuracy 61.4 | 12 | 4d ago | |
| FEVER | CDKC | Accuracy 73.73 | 11 | 4d ago | |
| InfoTabs (test) | MiniCPM-V-2.6 8B | Accuracy 75.74 | 11 | 4d ago | |
| PubHealthTab OOD (test) | Qwen3-VL-8B-DISCO | Accuracy 77.14 | 10 | 4d ago | |
| FEVER | | Accuracy 78 | 9 | 4d ago | |
| FACTKG 1.0 (test) | LLME (KG-GPT) | Accuracy 72.7 | 9 | 4d ago | |
| Symmetric FEVER 1.0 (test) | ProoFVer | Accuracy 85.88 | 9 | 4d ago | |
| HOVER (test) | DiffuTruth | AUROC 56.6 | 8 | 4d ago | |
| AVERITEC | DRAG_E | Accuracy 49.1 | 8 | 4d ago | |
| Creak | TOG | Accuracy 0.956 | 8 | 4d ago | |
| FEVER S R | RoBERTa-large | Precision 95.2 | 8 | 4d ago | |