
HoVer

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Claim Verification | HoVer (test) | Accuracy | 73.1 | 31 |
| Fact-checking | HOVER 4-hop (test) | Macro F1 | 66.23 | 16 |
| Fact-checking | HOVER 3-hop (test) | Macro F1 | 66.42 | 16 |
| Fact-checking | HOVER 2-hop (test) | Macro F1 | 75.13 | 16 |
| Multi-hop Faithfulness Hallucination Detection | HoVer Refined | Macro F1 | 82.9 | 14 |
| Fact-checking | HOVER | Macro F1 (2-hop) | 71.82 | 12 |
| Claim Verification | HOVER 4-hop | Accuracy | 73.62 | 12 |
| Claim Verification | HOVER 3-hop | Accuracy | 75.16 | 12 |
| Claim Verification | HOVER 2-hop | Accuracy | 76.69 | 12 |
| Fact Verification | HOVER (test) | AUROC | 56.6 | 8 |
| Multi-hop Fact Verification | HoVer 4-Hop | Macro-F1 | 63 | 7 |
| Multi-hop Fact Verification | HoVer 3-Hop | Macro F1 | 58 | 7 |
| Multi-hop Fact Verification | HoVer 2-Hop | Macro F1 | 71 | 7 |
| Retrieval | HoVer | Recall@5 | 0.768 | 7 |
| Claim Verification | HoVer | Accuracy | 71 | 6 |
| Multi-hop fact verification | HoVer few-shot | Recall | 56 | 4 |
| Fact Verification | HOVER | AUROC | 0.589 | 4 |
| Claim Verification | HoVer (dev) | Accuracy | 74.1 | 4 |