| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Faithfulness evaluation | WikiBio | AUC π-Soft-NS 0.438 | 27 | |
| Data-to-Text Generation | WikiBio (test) | BLEU 45.14 | 17 | |
| Knowledge modification | WikiBio | Edit Success Rate 100 | 15 | |
| Hallucination self-detection | WikiBio GPT-4o | Accuracy 85 | 12 | |
| Text Generation | WikiBIO | BLEU 9.68 | 11 | |
| Sentence-Level Confidence Prediction | WikiBio | AUROC 68.6 | 10 | |
| Table-to-text generation | WIKIBIO (test) | BLEU-4 44.89 | 10 | |
| Hallucination Detection | WikiBio GPT-3.5-Turbo-Instruct (test) | AUC-PR (Nonfactual) 92.5 | 8 | |
| Data-to-text generation | WikiBio 22 | BLEU 47.17 | 7 | |
| Knowledge Editing | WikiBio (test) | RwA 84.33 | 6 | |
| Table-to-Text Generation | WikiBio (val) | Fluency 99.6 | 4 | |
| Hallucination Detection | WikiBio | Metric - | 0 | |
| Faithfulness evaluation | WikiBio (test) | AUC π-Soft-NS - | 0 | |