
HallusionBench

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Hallucination Evaluation | HallusionBench | Accuracy | 82.02 | 153 |
| Visual Hallucination Evaluation | HallusionBench | Accuracy | 76.6 | 120 |
| Multimodal Reasoning | HallusionBench | Accuracy | 0.7293 | 42 |
| Hallucination and Visual Reasoning Evaluation | HallusionBench | Accuracy (aACC) | 81.8 | 40 |
| Hallucination Assessment | HallusionBench | Answer Accuracy (aAcc) | 71.6 | 39 |
| Multimodal Understanding | HallusionBench | Accuracy | 77.2 | 37 |
| Hallucination Robustness | HallusionBench | Score | 57.8 | 32 |
| Hallucination | HallusionBench | HallusionBench Score | 69.2 | 26 |
| Multimodal Hallucination Evaluation | HallusionBench | Hallucination Score | 70.7 | 22 |
| Hallucination Mitigation | HallusionBench visual-dependent setting (test) | qAcc (Question Accuracy) | 33.21 | 21 |
| Multimodal Hallucination Assessment | HallusionBench | Accuracy | 70 | 21 |
| Hallucination Detection | HallusionBench | Hallusion Score | 67.05 | 20 |
| General VQA | HallusionBench | Accuracy | 73.48 | 20 |
| Hallucination | HallusionBench | Pass@1 | 74 | 16 |
| Hallucination Diagnosis | HallusionBench | LI Score | 96 | 15 |
| Visual Perception | HallusionBench | Accuracy | 71.08 | 15 |
| Visual Reasoning | HallusionBench | Accuracy | 68.19 | 15 |
| Perception | HallusionBench | Score | 59.5 | 15 |
| Hallucination Evaluation | HallusionBench 2024 | Score | 52.2 | 13 |
| Visual Illusion and Hallucination Evaluation | HallusionBench (HallB) | HallB Score | 41.7 | 13 |
| Hallucination and Visual Illusion Assessment | HallusionBench | Accuracy | 22.51 | 12 |
| Hallucination Evaluation | HallusionBench GPT4-assisted (All) | Accuracy (All) | 49.94 | 11 |
| Hallucination Evaluation | HallusionBench 1.0 (test) | fACC | 22.2 | 10 |
| Discriminative Hallucination Detection | HallusionBench | Accuracy | 73 | 10 |
| Visual Hallucination Evaluation | HallusionBench visual questions | Accuracy | 65.8 | 10 |
Showing 25 of 40 rows