Fine-grained Multi-Image Understanding on VLM2-Bench

95.06 Matching Score

Human-Level

Apr 24, 2026
Updated 1mo ago

Evaluation Results

Method   Links
2026.04  95.06  98.11  94.23  96.02  91.29  92.87  97.08  91.17  94.48
2026.04  57.14  67.12  56.67  81.94  58     57.5   65     62     63.17
2026.04  55.6   47.03  52.5   74.1   54     60     77.5   51     58.97
2026.04  48.85  56.14  70.28  84.72  76     77.5   83     94     73.81
2026.04  47.49  63.03  61.4   72.2   55     57.5   71     51     59.83
2026.04  42.11  50.22  61.39  72.36  58.5   64.17  82     79     63.72
2026.04  40.93  43.83  50.83  63.33  34.5   63.33  70.5   47     51.78
2026.04  40.45  46.58  62.5   75.56  49.5   62.5   77.5   51     58.2
2026.04  35.91  43.38  41.72  71.39  47.5   59.76  80     69     56.08
2026.04  30.5   30.59  51.48  43.33  52.5   59.67  59.5   61.25  48.6
2026.04  27.03  48.86  66.39  80     70.5   73.33  80     88     66.76
2026.04  23.94  29.68  64.17  61.67  39.5   55     63     55     49
2026.04  21.24  26.53  55.23  53.33  46.5   60     51.5   52     45.79
2026.04  18.53  12.79  62.47  54.72  28.5   66.91  62     25     41.37
2026.04  18.07  19.18  61.84  68.08  37.5   67.92  72     47     48.95
2026.04  17.37  18.26  62.97  49.17  31     58.06  63     29     41.1
2026.04  16.6   13.7   56.17  47.22  27.5   46.67  62     37     38.36
2026.04  14.29  12.98  49.47  46.53  29     41.56  58     25     34.6
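
The page does not label the score columns. As a rough sanity check, the final value in each row appears to equal the arithmetic mean of the eight scores before it. That reading is an assumption inferred from the numbers, not something the page states. The minimal Python sketch below tests it on the first two rows; `ROWS` and `matches_reported_mean` are hypothetical names introduced for illustration only.

```python
from statistics import mean

# Two rows taken from the results above: eight scores, then the reported final value.
# Treating the final value as the mean of the eight scores is an assumption.
ROWS = [
    ([95.06, 98.11, 94.23, 96.02, 91.29, 92.87, 97.08, 91.17], 94.48),
    ([57.14, 67.12, 56.67, 81.94, 58, 57.5, 65, 62], 63.17),
]


def matches_reported_mean(scores, reported, tol=0.01):
    """Return True if the mean of `scores` is within `tol` of `reported`."""
    return abs(mean(scores) - reported) <= tol


if __name__ == "__main__":
    for scores, reported in ROWS:
        # Print the recomputed mean next to the reported value for comparison.
        print(round(mean(scores), 2), reported, matches_reported_mean(scores, reported))
```

The tolerance of 0.01 allows for the two-decimal rounding used on the page.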