AVeriTeC

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Claim Verification | AVeriTeC Retrieved (I) (dev) | Accuracy 73.6 | 28 |
| Claim Verification | AVeriTeC Retrieved (H) (dev) | Accuracy 72.8 | 28 |
| Claim Verification | AVeriTeC Golden (dev) | Accuracy 83.4 | 28 |
| Document-level Claim Extraction | AVeriTeC-DCE (test) | chrF 26.4 | 11 |
| Fact-checking | AVeriTeC (test) | Hu-METEOR (Q only) 0.48 | 9 |
| Faithfulness Evaluation | AVeriTeC | FID 0.3 | 8 |
| Fact Verification | AVERITEC | Accuracy 49.1 | 8 |
| Justification Quality Evaluation | AVeriTeC Retrieved (H) 50 correctly verified claims | MOS 3.67 | 6 |
| Fact-checking | AVeriTeC (dev) | Hu-METEOR (Q only) 0.46 | 6 |
| Document-level Claim Extraction (Sentence) | AVeriTeC-DCE 1.0 (test) | SARI 6.6 | 6 |
| Sentence Extraction | AVeriTeC-DCE 1.0 (test) | P@1 47.8 | 6 |
| Document-level Claim Extraction (Decontextualized Sentence) | AVeriTeC-DCE 1.0 (test) | SARI 6.7 | 5 |
| Claim Verification | AVeriTeC (dev) | Supported F1 68 | 4 |
| Automated Fact-Checking | AVeriTeC (test) | Precision 54 | 4 |
| Sentence extraction | AVeriTeC-DCE (full) | IsCheckWorthy 44 | 2 |