| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Fact Verification | FEVER | Accuracy 53.9 | 67 | |
| Fact Verification | FEVER (dev) | Label Accuracy 82.1 | 57 | |
| Fact Verification | FEVER (test) | LA Score 79.47 | 32 | |
| Fact Verification | FEVER 1.0 (dev) | Label Accuracy 89.07 | 23 | |
| Fact Extraction and Verification | FEVER (test) | Label Accuracy (LA) 75.96 | 18 | |
| Explanation Evaluation | FEVER (test) | Sufficiency 9.72 | 16 | |
| Fact Verification | FEVER-Symmetric | Precision 88 | 16 | |
| Fact-checking | FEVER | F1 Macro 94.3 | 14 | |
| Fact Verification | FEVER 1.0 (test) | Label Accuracy 74.07 | 14 | |
| Classification | FEVER Symmetric v2 1.0 | Accuracy 69.1 | 13 | |
| Classification | FEVER v1 (ID) | Accuracy 87.5 | 13 | |
| Fact Verification | FEVER-S | Accuracy 54 | 12 | |
| Fact Verification | FEVER | Accuracy 61.4 | 12 | |
| Fact-verification | FEVER | Accuracy 73.73 | 11 | |
| Sentence-Level Confidence Prediction | FEVER | AUROC 0.7 | 10 | |
| global fact consistency verification | FEVER | Precision 99.5 | 10 | |
| Fact checking | FEVER v1.0 (dev) | Acc 55.1 | 10 | |
| Claim Verification | FEVER (test) | Accuracy 72.5 | 10 | |
| Fact Verification | FEVER | Accuracy 78 | 9 | |
| Neural Caching | FEVER | Online Accuracy (AUC) 75.3 | 9 | |
| Fact Verification | Symmetric FEVER 1.0 (test) | Accuracy 85.88 | 9 | |
| Fact Extraction and Verification | FEVER (dev) | Label Accuracy (LA) 76.3 | 9 | |
| Information Retrieval | FEVER (test) | NDCG@10 0.796 | 9 | |
| Fact Verification | FEVER S R | Precision 95.2 | 8 | |
| Fact Extraction and Verification | FEVER leaderboard March 2019 (test) | Evidence F1 77.7 | 8 | |