
Multi-image Understanding on MuirBench

62.3 Accuracy

GPT-4V

[Chart: scores over time, Dec 16, 2025 to Dec 27, 2025]
Updated 1mo ago

Evaluation Results

Date      Score
2025.12   62.3
2025.12   58.2
2025.12   51.2
2025.12   50.2
2025.12   48.5
2025.12   48.3
2025.12   46.5
2025.12   45.1
2025.12   44.8
2025.12   40.5
2025.12   40.3
2025.12   39.9
2025.12   0.68
2025.12   0.551
2025.12   0.512
2025.12   0.483
2025.12   0.418