Alignment on HalluBench
[Chart: Accuracy. Top entry: EVE (Ours-8B-iter4), 63, Apr 20, 2026.]
Updated 1mo ago
Evaluation Results
| Method | Model Category | Date | Accuracy |
| --- | --- | --- | --- |
| EVE (Ours-8B-iter4) | Pseudo-... | 2026.04 | 63 |
| Jigsaw-R1-8B | Templat... | 2026.04 | 62.1 |
| MM-Zero-8B-iter3 | Pseudo-... | 2026.04 | 61.7 |
| VisPlay-8B-iter3 | Pseudo-... | 2026.04 | 61.6 |
| Qwen3-VL-8B-Instruct | Open-So... | 2026.04 | 61.2 |
| GPT-4o-20240513 | Closed-... | 2026.04 | 55 |
| Qwen-ViPER-7B | Pseudo-... | 2026.04 | 54.4 |
| Spatial-SSRL-7B | Templat... | 2026.04 | 53.2 |
| InternVL3-9B | Open-So... | 2026.04 | 51.2 |
| LLaVA-OneVision-72B | Open-So... | 2026.04 | 49 |