Discriminative Hallucination Detection on CRPE R
Top result: Qwen2.5-VL + FINER-Tuning, Accuracy 70.7 (Mar 18, 2026)
[Chart: Accuracy over time]
Evaluation Results

Method                        Size   Date      Accuracy
Qwen2.5-VL + FINER-Tuning     7B     2026.03   70.7
Qwen2.5-VL                    7B     2026.03   69.9
InternVL-3.5 + FINER-Tuning   14B    2026.03   69
InternVL-3.5 + FINER-Tuning   8B     2026.03   68
InternVL-3.5                  8B     2026.03   67.7
InternVL-3.5                  14B    2026.03   67.2
LLaVA-1.6                     7B     2026.03   56.5
LLaVA-1.6 + FINER-Tuning      7B     2026.03   56
OmniLMM + RLAIF-V             12B    2026.03   52.2
OmniLMM                       12B    2026.03   51.7