Faithful Reasoning on SPD-Faith Bench Multi-Difference 1.0 (test)
[Chart: top CR 87.7 (GLM-4.5V), Feb 8, 2026; metrics shown: CR, DRF]
Evaluation Results
Method              Access Type   Date     CR    DRF
GLM-4.5V            Open-Source   2026.02  87.7  58.3
Qwen3-VL-235B-A22B  Open-Source   2026.02  82.3  44.8
Gemini-2.5-Pro      Close-Source  2026.02  78.2  40.4
GPT-4o              Close-Source  2026.02  74.8  39.3
Claude-4.5-Haiku    Close-Source  2026.02  65.4  38.2
Qwen3-VL-32B        Open-Source   2026.02  64.6  37.6
InternVL2.5-38B     Open-Source   2026.02  59.6  35.8
Qwen2.5-VL-72B      Open-Source   2026.02  56.8  31.7
DeepSeek-VL2        Open-Source   2026.02  53.9  25.6
Qwen2.5-VL-7B       Open-Source   2026.02  38.2  29.5
MiniCPM-V-2.6       Open-Source   2026.02  32.3  23.4
InternVL2.5-8B      Open-Source   2026.02  31.0  27.2