
Multimodal Conversation on LLaVA-Bench Wild

Score: 102

GPT-4o-0513

[Chart: score (y-axis) by date (x-axis), Oct 5, 2023 to May 26, 2026]
Updated 7d ago

Evaluation Results

Method | Links
102
95.3
2026.01
82.7
82
2026.01
81.2
81
2024.09
81
2026.01
80
79.9
2026.01
79.7
2026.01
77.9
2026.01
77.8
2026.01
76.8
2026.01
76.8
2026.01
76.7
2026.01
75.6
2026.01
74.8
2026.01
74.7
2026.01
74.1
2026.01
74
2026.01
73.9
2026.01
73.4
2024.01
73.3
2024.01
73.3
73.3
2023.10
72.5
2026.01
72.5
2026.01
72.5
2023.10
72
2026.01
72
2024.06
71.3
2024.09
71.1
2024.06
71
2026.01
70.9
2024.01
70.7
2024.09
70.2
2026.01
69.7
2024.09
69.3
66.3
2023.10
65.4
2023.10
62.8
2023.12
62.6
2023.10
60.9
2023.12
59.1
2023.10
58.2
2026.01
56.8
2026.01
55.5
2026.01
54.1
2026.01
53.2
2026.01
52.8
2026.01
51.6
2026.05
40.8
2026.03
39.5
2026.03
39.4
2026.05
39.4
2023.10
38.1
2026.03
37.6
2026.05
37.6
2026.03
35.7
2026.05
35.7
2026.03
35.4
2026.05
35.4
2026.03
34.6
2026.05
34.6
2026.03
34.2
2026.05
34.2
2026.03
33.5
2026.05
33.5
2026.03
32.8
2026.05
32.8
2026.03
32.1
2026.05
32.1
2026.03
31.2
2026.05
31.2
2026.03
30.4
2026.05
30.4
2026.03
29.8
2026.05
29.8