| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Hallucination Evaluation | HallusionBench | Accuracy | 82.02 | 153 |
| Visual Hallucination Evaluation | HallusionBench | Accuracy | 76.6 | 120 |
| Multimodal Reasoning | HallusionBench | Accuracy | 0.7293 | 42 |
| Hallucination and Visual Reasoning Evaluation | HallusionBench | Accuracy (aACC) | 81.8 | 40 |
| Hallucination Assessment | HallusionBench | Answer Accuracy (aAcc) | 71.6 | 39 |
| Multimodal Understanding | HallusionBench | Accuracy | 77.2 | 37 |
| Hallucination Robustness | HallusionBench | Score | 57.8 | 32 |
| Hallucination | HallusionBench | HallusionBench Score | 69.2 | 26 |
| Multimodal Hallucination Evaluation | HallusionBench | Hallucination Score | 70.7 | 22 |
| Hallucination Mitigation | HallusionBench visual-dependent setting (test) | qAcc (Question Accuracy) | 33.21 | 21 |
| Multimodal Hallucination Assessment | HallusionBench | Accuracy | 70 | 21 |
| Hallucination Detection | HallusionBench | Hallusion Score | 67.05 | 20 |
| General VQA | HallusionBench | Accuracy | 73.48 | 20 |
| Hallucination | HallusionBench | Pass@1 | 74 | 16 |
| Hallucination Diagnosis | HallusionBench | LI Score | 96 | 15 |
| Visual Perception | HallusionBench | Accuracy | 71.08 | 15 |
| Visual Reasoning | HallusionBench | Accuracy | 68.19 | 15 |
| Perception | HallusionBench | Score | 59.5 | 15 |
| Hallucination Evaluation | HallusionBench 2024 | Score | 52.2 | 13 |
| Visual Illusion and Hallucination Evaluation | HallusionBench (HallB) | HallB Score | 41.7 | 13 |
| Hallucination and Visual Illusion Assessment | HallusionBench | Accuracy | 22.51 | 12 |
| Hallucination Evaluation | HallusionBench GPT4-assisted (All) | Accuracy (All) | 49.94 | 11 |
| Hallucination Evaluation | HallusionBench 1.0 (test) | fACC | 22.2 | 10 |
| Discriminative Hallucination Detection | HallusionBench | Accuracy | 73 | 10 |
| Visual Hallucination Evaluation | HallusionBench visual questions | Accuracy | 65.8 | 10 |