
MHaluBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| element-level text-to-image alignment evaluation | MHaluBench | SRCC 72.7 | 17 |
| Hallucination Detection | MHaluBench Image-to-Text Segment-level | Hallucinatory Precision 90.44 | 7 |
| Hallucination Detection | MHaluBench Image-to-Text (Claim-level) | Hallucinatory Precision 86.54 | 7 |
| Visual Reasoning | MHaluBench | SRCC 70.6 | 5 |