Hallucination Reasoning on SugarCrepe
Top result: Argos, 86.4 Accuracy
[Chart: Accuracy (y-axis) by date (x-axis); date shown: Dec 3, 2025]
Updated 4d ago
Evaluation Results
Method         Details                     Date      Accuracy
Argos          -                           2025.12   86.4
Qwen2.5VL-7B   -                           2025.12   85.2
Video-R1       Training=SFT                2025.12   83.3
Qwen2.5VL-7B   Chain-of-Thought (CoT)...   2025.12   83.2
Video-R1       Training=RL                 2025.12   79.9