
Hallucination Detection on MHaluBench Image-to-Text (Claim-level)

Top result: 86.54 Hallucinatory Precision (GPT-based Self-Check)

[Chart: Hallucinatory Precision over time; y-axis ticks 82.38, 83.46, 84.54, 85.62; x-axis Jun 16, 2025]
Updated 1mo ago

Evaluation Results

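| Date | Hal. P | Hal. R | Hal. F1 | Non-Hal. P | Non-Hal. R | Non-Hal. F1 | Acc. | Mac. P | Mac. R | Mac. F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| 2025.06 | 86.54 | 85.13 | 85.83 | 69.05 | 71.48 | 70.24 | 80.8 | 77.8 | 78.3 | 78.04 |
| 2025.06 | 84.91 | 89.52 | 87.15 | 84.91 | 89.52 | 87.15 | 85.39 | 86.07 | 80.28 | 83.07 |
| 2025.06 | 84.78 | 80.07 | 82.35 | 61.64 | 69.01 | 65.12 | 76.56 | 73.21 | 74.54 | 73.73 |
| 2025.06 | 84.44 | 72.44 | 77.98 | 71.08 | 83.54 | 76.8 | 77.41 | 77.76 | 77.99 | 77.39 |
| 2025.06 | 84.24 | 66.75 | 74.48 | 67.35 | 84.6 | 75 | 74.74 | 75.8 | 75.68 | 74.74 |
| 2025.06 | 83.17 | 42.15 | 55.95 | 55.64 | 89.48 | 68.61 | 63.34 | 69.41 | 65.82 | 62.28 |
| 2025.06 | 82.54 | 85.29 | 83.89 | 81.08 | 77.74 | 79.38 | 81.91 | 81.81 | 81.52 | 81.63 |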
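
The figures in each results row appear to be internally related, and a short check can make that relationship explicit. The sketch below assumes (this is an inference, not something the page states) that the ten values per row are, in order: Hallucinatory precision, recall, F1; Non-Hallucinatory precision, recall, F1; accuracy; and macro-averaged precision, recall, F1. Under that assumption, F1 is the harmonic mean of precision and recall, and each macro value is the plain mean of the two class values. The helper names `f1` and `macro` are illustrative only.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


def macro(hallucinatory: float, non_hallucinatory: float) -> float:
    """Unweighted mean of the two per-class values."""
    return (hallucinatory + non_hallucinatory) / 2


# First results row (the 86.54 Hallucinatory Precision entry),
# split under the ASSUMED column order described above.
hal_p, hal_r, hal_f1 = 86.54, 85.13, 85.83
non_p, non_r, non_f1 = 69.05, 71.48, 70.24
mac_p, mac_r, mac_f1 = 77.8, 78.3, 78.04

TOL = 0.01  # reported values are rounded to at most two decimals

checks = {
    "hallucinatory F1": abs(f1(hal_p, hal_r) - hal_f1) < TOL,
    "non-hallucinatory F1": abs(f1(non_p, non_r) - non_f1) < TOL,
    "macro precision": abs(macro(hal_p, non_p) - mac_p) < TOL,
    "macro recall": abs(macro(hal_r, non_r) - mac_r) < TOL,
    "macro F1": abs(macro(hal_f1, non_f1) - mac_f1) < TOL,
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

The same check can be run on any other row by substituting its values; a row that fails would suggest either a different column order or an error in the reported figures.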