Hallucination Evaluation on HalBench

Score: 76.7

Qwen3-VL-8B-Thinking + DiG

[Score chart; Dec 14, 2025]
Updated 4d ago

Evaluation Results

Date     Score
2025.12  76.7
2025.12  73.8
2025.12  73.3
2025.12  70.1
2025.12  57.1
2025.12  55.2
2025.12  50.1