Multi-modal Reasoning on MMVet (test)

80.8 Accuracy

GPT-4o

Updated 8d ago

Evaluation Results

Method    Links
2026.01    80.8     -
2024.05    74.24    -
2026.01    73.2     -
2026.01    71.6     440.7
2026.01    71.2     114.1
2026.01    70.9     118.8
2026.01    70.6     137.9
2024.05    70.51    -
2026.01    68.7     -
2026.01    68.5     312.7
2026.01    67.1     132.5
2026.01    65.9     166.3
2026.01    65.2     108.4
2024.05    64.2     -
2026.01    64       112.7
2026.01    62       132.5
2026.01    62       117.6
2026.01    61.9     -
2026.01    61.3     138.8
2024.05    60.2     -
2026.01    60       -
2025.07    58.5     -
2025.07    53.5     -
2025.07    53.5     -
2025.07    53       -
2024.05    51.7     -
2024.05    51.3     -
2025.07    48       -
2025.07    47.7     -
2026.01    43.9     218.3
2026.05    30.9     -
2026.05    29.8     -
2026.05    29.7     -
2026.05    29.5     -
2026.04    28.4     -
2026.04    27.8     -
2026.04    27.3     -
2026.04    26.9     -
2026.04    26.6     -
2026.05    26.6     -
2026.04    26.3     -
2026.04    24.3     -
2026.04    23.2     -
2026.04    21.6     -
2026.05    21.1     -
2026.04    20.6     -
2026.04    20       -
2026.04    17.5     -
2026.04    8.2      -