
Multimodal Conversation on LLaVA-Bench Wild

Score: 102

GPT-4o-0513

[Chart: score over time, Oct 5, 2023 to Mar 1, 2026]
Updated 1mo ago

Evaluation Results

Method / Links
102
95.3
2026.01
82.7
82
2026.01
81.2
81
2024.09
81
2026.01
80
79.9
2026.01
79.7
2026.01
77.9
2026.01
77.8
2026.01
76.8
2026.01
76.8
2026.01
76.7
2026.01
75.6
2026.01
74.8
2026.01
74.7
2026.01
74.1
2026.01
74
2026.01
73.9
2026.01
73.4
2024.01
73.3
2024.01
73.3
73.3
2023.10
72.5
2026.01
72.5
2026.01
72.5
2023.10
72
2026.01
72
2024.06
71.3
2024.09
71.1
2024.06
71
2026.01
70.9
2024.01
70.7
2024.09
70.2
2026.01
69.7
2024.09
69.3
66.3
2023.10
65.4
2023.10
62.8
2023.12
62.6
2023.10
60.9
2023.12
59.1
2023.10
58.2
2026.01
56.8
2026.01
55.5
2026.01
54.1
2026.01
53.2
2026.01
52.8
2026.01
51.6
2026.03
39.5
2026.03
39.4
2023.10
38.1
2026.03
37.6
2026.03
35.7
2026.03
35.4
2026.03
34.6
2026.03
34.2
2026.03
33.5
2026.03
32.8
2026.03
32.1
2026.03
31.2
2026.03
30.4
2026.03
29.8