| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Honesty Evaluation | FActScore v1.0 | Score: 47.3 | 20 |
| Claim-level Uncertainty Quantification | FactScore English (test) | ROC-AUC: 71 | 20 |
| Fact-checking of atomic claims | FactScore English | PR-AUC: 0.34 | 20 |
| Factual Text Generation | FactScore | AURC: 0.7345 | 14 |
| Factuality Generation | FActScore (test) | Number of Facts: 20.4 | 12 |
| Factuality Evaluation | FactScore (unlabeled) | US (%): 76.4 | 10 |
| Factuality Evaluation | FactScore (labeled) | LS Score (%): 64.8 | 10 |
| Long-form text generation | FactScore | Response Completeness: 100 | 9 |
| Long-form Factuality Verification | FactScore | Precision@1: 65.41 | 7 |
| Consistency Assessment of Generated Reference Points | FactScore LLM-based evaluation | Score: 86.36 | 6 |
| Factuality Evaluation | FActScore | Pairwise Score: 69.3 | 3 |