| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Hallucination Evaluation | HallusionBench | Average Score | 93.1 | 93 |
| Hallucination and Visual Reasoning Evaluation | HallusionBench | Score | 59.2 | 37 |
| Hallucination Robustness | HallusionBench | Score | 57.8 | 32 |
| Hallucination Assessment | HallusionBench | Question Accuracy (qAcc) | 49 | 30 |
| Visual Hallucination Evaluation | HallusionBench | Accuracy (Q) | 31.42 | 19 |
| Multimodal Reasoning | HallusionBench | Accuracy | 0.709 | 17 |
| Hallucination | HallusionBench | Pass@1 | 74 | 16 |
| Multimodal Hallucination Evaluation | HallusionBench | Hallucination Score | 70.7 | 14 |
| Hallucination Evaluation | HallusionBench 2024 | Score | 52.2 | 13 |
| Visual Illusion and Hallucination Evaluation | HallusionBench (HallB) | HallB Score | 41.7 | 13 |
| Hallucination Evaluation | HallusionBench GPT4-assisted (All) | Accuracy (All) | 49.94 | 11 |
| Visual Hallucination Evaluation | HallusionBench visual questions | Accuracy | 65.8 | 10 |
| Vision-Language Reasoning | HallusionBench (test) | Simple Accuracy | 53.31 | 7 |
| General visual question answering | HallusionBench | Pass@1 | 63.7 | 7 |
| Hallucination control | HallusionBench | General Score | 60.5 | 6 |
| General VQA | HallusionBench | Accuracy | 73.48 | 5 |
| Hallucination Analysis | HallusionBench | fACC | 18.7 | 4 |
| Hallucination Evaluation | HallusionBench (test) | Question Pair Accuracy | 17.8 | 4 |
| Visual Question Answering | HallusionBench HBI (all) | Score | 45.21 | 4 |
| Paired-prompt evaluation | HallusionBench | Simple Accuracy | 52.89 | 2 |
| Visual Question Answering | HallusionBench | Simple Accuracy | 51.31 | 2 |