
Visual Reasoning on Out-of-Domain (OOD) Aggregate (HalluBench, MathVista, MathVerse, MathVision)

OOD Avg Accuracy: 0.5531

SaEI

[Chart: OOD Avg Accuracy; y-axis ticks 0.497772, 0.512136, 0.5265, 0.540864; x-axis label Dec 11, 2025]
Updated 3d ago

Evaluation Results

Method    Date       OOD Avg Accuracy    Links
-         2025.12    0.5531              -
-         2025.12    0.5474              -
-         2025.12    0.547               -
-         2025.12    0.5368              -
-         2025.12    0.4999              -