| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|---|
| Factuality Checking | LLM-AggreFact (test) | CNN Score: 72.5 | 16 |
| Faithfulness Hallucination Detection | LLM-AggreFact Refined | Agg-CNN: 86.8 | 14 |
| Factuality Evaluation | LLM-AggreFact (test) | CNN Score: 69.9 | 13 |
| Fact-Checking | LLM-AGGREFACT (test) | Cost ($): 0.2 | 10 |