
Visual Reasoning on REASONMAP Short questions

Weighted Accuracy: 0.5998

GPT-5

[Chart: y-axis ticks 0.032896, 0.180073, 0.32725, 0.474427; x-axis date Oct 2, 2025]
Updated 3d ago

Evaluation Results

Date       Weighted Accuracy   Second value (unlabeled)
2025.10    0.5998              0.0948
2025.10    0.4115              0.0684
2025.10    0.342               0.0525
2025.10    0.3151              0.0621
2025.10    0.2951              0.06
2025.10    0.2882              0.0588
-          0.2717              0.0568
-          0.2665              0.0509
2025.10    0.2622              0.0552
2025.10    0.2622              0.0557
-          0.1649              0.0388
2025.10    0.1363              0.0409
-          0.1328              0.0401
-          0.1276              0.033
-          0.0868              0.0275
-          0.0547              0.0244