
Multimodal Reasoning on SPD-Faith Bench Easy 1.0

Contradiction Rate: 5

GPT-4o

[Chart: Contradiction Rate; Feb 8, 2026]
Updated 3d ago

Evaluation Results

Date      Contradiction Rate
2026.02   5
2026.02   8
2026.02   10.5
2026.02   16
2026.02   16.5
2026.02   16.5
2026.02   19
2026.02   19.2
2026.02   23
2026.02   23
2026.02   32.5
2026.02   39.5