
Visual Reasoning on VLM2-Bench

95.06 Mat

Human-Level

Feb 28, 2026
Updated 1mo ago

Evaluation Results

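| Date | Method | Mat | Trk | OC-Cpr | OC-Cnt | OC-Grp | PC-Cpr | PC-Cnt | PC-Grp | PC-VID | Overall | Δ human |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | Human-Level | 95.06 | 98.11 | 96.02 | 94.23 | 91.29 | 97.08 | 92.87 | 91.17 | 100 | 95.16 | 0 |
| 2026.02 |  | 57.23 | 54.14 | 87.56 | 73.06 | 78.5 | 92 | 76.67 | 66 | 46.75 | 70.21 | -24.95 |
| 2026.02 |  | 57.14 | 67.12 | 81.94 | 56.67 | 58 | 65 | 57.5 | 62 | 44.25 | 61.06 | -34.09 |
| 2026.02 |  | 55.6 | 47.03 | 74.1 | 52.5 | 54 | 77.5 | 60 | 51 | 43.5 | 57.24 | -37.91 |
| 2026.02 |  | 52.88 | 51.59 | 86.81 | 69.64 | 74.5 | 90 | 75 | 57 | 40.5 | 66.43 | -28.73 |
| 2026.02 |  | 47.49 | 63.03 | 72.2 | 61.4 | 55 | 71 | 57.5 | 51 | 47.75 | 58.49 | -36.67 |
| 2026.02 |  | 43.24 | 42.92 | 66.39 | 50.56 | 36 | 62.5 | 55.83 | 39 | 36.75 | 48.91 | -46.24 |
| 2026.02 |  | 41.24 | 26.53 | 72.22 | 67.65 | 40 | 85 | 66.67 | 52.25 | 50.25 | 55.41 | -39.75 |
| 2026.02 |  | 40.93 | 43.83 | 63.33 | 50.83 | 34.5 | 70.5 | 63.33 | 47 | 36.5 | 50.08 | -45.08 |
| 2026.02 |  | 40.45 | 46.58 | 75.56 | 62.5 | 49.5 | 77.5 | 62.5 | 51 | 36.5 | 55.79 | -39.37 |
| 2026.02 |  | 37.45 | 39.27 | 74.17 | 80.62 | 57.5 | 50 | 90.5 | 47 | 66.75 | 60.36 | -34.8 |
| 2026.02 |  | 35.91 | 43.38 | 71.39 | 41.72 | 47.5 | 80 | 59.76 | 69 | 45 | 54.82 | -40.34 |
| 2026.02 |  | 30.5 | 30.59 | 43.33 | 51.48 | 52.5 | 59.5 | 59.67 | 61.25 | 45.25 | 45.59 | -49.57 |
|  |  | 25 | 25 | 50 | 34.88 | 25 | 50 | 34.87 | 25 | - | 32.73 | -61.44 |
| 2026.02 |  | 18.53 | 12.79 | 54.72 | 62.47 | 28.5 | 62 | 66.91 | 25 | 59 | 45.65 | -49.51 |
| 2026.02 |  | 18.07 | 19.18 | 68.08 | 61.84 | 37.5 | 72 | 67.92 | 47 | 55.25 | 49.76 | -45.4 |
| 2026.02 |  | 17.37 | 18.26 | 49.17 | 62.97 | 31 | 63 | 58.06 | 29 | 43 | 40.87 | -54.31 |
| 2026.02 |  | 16.6 | 13.7 | 47.22 | 56.17 | 27.5 | 62 | 46.67 | 37 | 47.25 | 39.35 | -55.81 |
| 2026.02 |  | 14.29 | 12.98 | 46.53 | 49.47 | 29 | 58 | 41.56 | 25 | 45 | 37.1 | -58.06 |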