Multimodal Evaluation on LLaVA-bench in-the-wild

Score: 121.88

Self-Aug

[Chart: score over time, Jun 1, 2023 to May 26, 2026]
Updated 7d ago

Evaluation Results

Date (YYYY.MM)    Score
2025.10           121.88
2025.10           121.06
2025.10           119.76
2025.10           118.5
2024.01           97.3
2024.01           94.1
2024.01           90.3
2024.01           88.7
2024.01           86.8
2024.08           80.1
2023.06           78.3
2024.08           77.6
2023.06           77.6
2025.10           76.24
2023.06           75.8
2023.06           75.2
2025.10           74.56
2024.11           73.9
2025.10           73.62
2024.08           73
2024.01           70.9
2024.08           70.7
2024.01           70.7
2024.11           70.5
2024.08           69.7
2025.10           69.48
2025.10           69.22
2024.06           69.2
2025.10           69.12
2025.10           69.08
2024.06           68.2
2026.05           67.9
2024.06           67.7
2024.06           67.5
2024.06           67.3
2026.05           66.7
2024.06           65.9
2026.05           65.8
2024.06           65.7
2024.11           65.6
2026.05           65.6
2024.06           65.5
2024.11           65.4
2026.05           65.3
2026.05           65.2
2026.05           64.9
2024.11           64.7
2024.06           64.3
2024.08           63.4
2024.01           63.4
2024.01           63.4
2026.05           62.4
2026.05           62.3
2026.05           61.3
2024.11           60.9
2024.08           60.9
2024.01           60.9
2026.05           60.6
2026.05           59
2026.05           58.8
2024.11           58.7
2025.10           58.48
2025.10           58.3
2024.01           58.2
2025.10           56.98
2025.10           56.82
2025.10           53.24
2025.10           39.84
2025.10           39.18
2025.10           38.78
2024.08           38.1
2024.01           38.1
2025.10           35.98