Visual Understanding on R-Bench (test)

65.29 MCQ (low)

Robust-R1 (SFT and RL)

Dec 19, 2025
Updated 3d ago

Evaluation Results

Date     MCQ (low)                                                          Mean
2025.12  65.29      63.91  60.97  49.14  49.09  49.8   40.68  37.81  34.84  50.17
2025.12  64.11      60.22  57.32  48.72  48.54  49.04  37.78  37.04  33.3   48.45
2025.12  62.35      60.24  59.14  49.82  45.39  51.08  36.67  30.41  28.51  47.06
2025.12  61.76      60.87  56.1   48.04  48.36  50.12  40.8   38.58  35.18  48.86
2025.12  58.23      57.76  50.6   48.65  46.3   44.19  40.48  37.46  34.8   46.49
2025.12  47.05      46.58  40.24  45.03  43.39  47.43  22.9   22.19  19.83  37.18
2025.12  46.47      42.23  40.24  46.87  39.94  44.61  21.11  21.95  19.37  35.86
2025.12  33.52      26.08  30.48  26.07  22.12  24.43  0.68   0.65   0.67   18.3
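
Reading each result row as ten scores, the tenth appears to be the mean of the first nine. The Python sketch below checks that reading. The split of each row into ten values and the names `rows` and `mean_gap` are assumptions made for illustration; they are not part of the leaderboard itself.

```python
# Scores copied from the Evaluation Results rows, split into ten values each.
# Assumption: the tenth value is the mean of the nine values before it.
rows = [
    [65.29, 63.91, 60.97, 49.14, 49.09, 49.8, 40.68, 37.81, 34.84, 50.17],
    [64.11, 60.22, 57.32, 48.72, 48.54, 49.04, 37.78, 37.04, 33.3, 48.45],
    [62.35, 60.24, 59.14, 49.82, 45.39, 51.08, 36.67, 30.41, 28.51, 47.06],
    [61.76, 60.87, 56.1, 48.04, 48.36, 50.12, 40.8, 38.58, 35.18, 48.86],
    [58.23, 57.76, 50.6, 48.65, 46.3, 44.19, 40.48, 37.46, 34.8, 46.49],
    [47.05, 46.58, 40.24, 45.03, 43.39, 47.43, 22.9, 22.19, 19.83, 37.18],
    [46.47, 42.23, 40.24, 46.87, 39.94, 44.61, 21.11, 21.95, 19.37, 35.86],
    [33.52, 26.08, 30.48, 26.07, 22.12, 24.43, 0.68, 0.65, 0.67, 18.3],
]


def mean_gap(row):
    """Return the absolute gap between the mean of the first nine
    scores and the reported tenth value of one row."""
    *scores, reported = row
    return abs(sum(scores) / len(scores) - reported)


gaps = [mean_gap(row) for row in rows]

for row, gap in zip(rows, gaps):
    recomputed = sum(row[:-1]) / (len(row) - 1)
    print(f"reported={row[-1]:6.2f}  recomputed={recomputed:7.3f}  gap={gap:.4f}")
```

Under this reading, every gap stays below 0.01, which is what one would expect if the reported value were cut to two decimals for display.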