HoVer

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Fact-checking | HOVER 4-hop (test) | Macro F1: 66.23 | 16 |
| Fact-checking | HOVER 3-hop (test) | Macro F1: 66.42 | 16 |
| Fact-checking | HOVER 2-hop (test) | Macro F1: 75.13 | 16 |
| Multi-hop Faithfulness Hallucination Detection | HoVer Refined | Macro F1: 82.9 | 14 |
| Fact-checking | HOVER | Macro F1 (2-hop): 71.82 | 12 |
| Claim Verification | HOVER 4-hop | Accuracy: 73.62 | 12 |
| Claim Verification | HOVER 3-hop | Accuracy: 75.16 | 12 |
| Claim Verification | HOVER 2-hop | Accuracy: 76.69 | 12 |
| Claim Verification | HoVer (test) | Accuracy: 73.1 | 12 |
| Fact Verification | HOVER (test) | AUROC: 56.6 | 8 |
| Multi-hop Fact Verification | HoVer 4-Hop | Macro-F1: 63 | 7 |
| Multi-hop Fact Verification | HoVer 3-Hop | Macro F1: 58 | 7 |
| Multi-hop Fact Verification | HoVer 2-Hop | Macro F1: 71 | 7 |
| Retrieval | HoVer | Recall@5: 0.768 | 7 |
| Claim Verification | HoVer | Accuracy: 71 | 6 |
| Fact Verification | HOVER | AUROC: 0.589 | 4 |
| Claim Verification | HoVer (dev) | Accuracy: 74.1 | 4 |