| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Factuality Hallucination Evaluation | LongFact (test) | Response Score: 100 | 30 |
| Factuality Hallucination | LongFact | Facts Score: 23.5 | 30 |
| Factual Text Generation | LongFact Objects | AURC: 0.426 | 14 |
| Long-form generation factuality and uncertainty estimation | LongFact (test) | Factuality Score: 91.5 | 14 |
| Long-form Question Answering | LongFact | VeriScore F1: 75.9 | 14 |
| Hallucination Detection | LongFact-Aug (test) | AUC: 0.9404 | 4 |