| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Fact-checking | HOVER 4-hop (test) | Macro F1 66.23 | 16 | |
| Fact-checking | HOVER 3-hop (test) | Macro F1 66.42 | 16 | |
| Fact-checking | HOVER 2-hop (test) | Macro F1 75.13 | 16 | |
| Multi-hop Faithfulness Hallucination Detection | HoVer Refined | Macro F1 82.9 | 14 | |
| Fact-checking | HOVER | Macro F1 (2-hop) 71.82 | 12 | |
| Claim Verification | HOVER 4-hop | Accuracy 73.62 | 12 | |
| Claim Verification | HOVER 3-hop | Accuracy 75.16 | 12 | |
| Claim Verification | HOVER 2-hop | Accuracy 76.69 | 12 | |
| Claim Verification | HoVer (test) | Accuracy 73.1 | 12 | |
| Fact Verification | HOVER (test) | AUROC 56.6 | 8 | |
| Multi-hop Fact Verification | HoVer 4-Hop | Macro F1 63 | 7 | |
| Multi-hop Fact Verification | HoVer 3-Hop | Macro F1 58 | 7 | |
| Multi-hop Fact Verification | HoVer 2-Hop | Macro F1 71 | 7 | |
| Retrieval | HoVer | Recall@5 0.768 | 7 | |
| Claim Verification | HoVer | Accuracy 71 | 6 | |
| Fact Verification | HOVER | AUROC 0.589 | 4 | |
| Claim Verification | HoVer (dev) | Accuracy 74.1 | 4 | |