FactScore

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Honesty Evaluation | FActScore v1.0 | Score: 47.3 | 20 |
| Claim-level Uncertainty Quantification | FactScore English (test) | ROC-AUC: 71 | 20 |
| Fact-checking of atomic claims | FactScore English | PR-AUC: 0.34 | 20 |
| Factual Text Generation | FactScore | AURC: 0.7345 | 14 |
| Factuality Generation | FActScore (test) | Number of Facts: 20.4 | 12 |
| Factuality Evaluation | FactScore (unlabeled) | US (%): 76.4 | 10 |
| Factuality Evaluation | FactScore (labeled) | LS Score (%): 64.8 | 10 |
| Long-form text generation | FactScore | Response Completeness: 100 | 9 |
| Long-form Factuality Verification | FactScore | Precision@1: 65.41 | 7 |
| Consistency Assessment of Generated Reference Points | FactScore LLM-based evaluation | Score: 86.36 | 6 |
| Factuality Evaluation | FActScore | Pairwise Score: 69.3 | 3 |