Multimodal Evaluation on LLaVA-bench in-the-wild

Score: 121.88

Self-Aug

[Chart: score over time, Jan 11, 2024 to Oct 15, 2025]
Updated 1mo ago

Evaluation Results

Date       Score
2025.10    121.88
2025.10    121.06
2025.10    119.76
2025.10    118.5
2024.01    97.3
2024.01    94.1
2024.01    90.3
2024.01    88.7
2024.01    86.8
2024.08    80.1
2024.08    77.6
2025.10    76.24
2025.10    74.56
2024.11    73.9
2025.10    73.62
2024.08    73
2024.01    70.9
2024.08    70.7
2024.01    70.7
2024.11    70.5
2024.08    69.7
2025.10    69.48
2025.10    69.22
2024.06    69.2
2025.10    69.12
2025.10    69.08
2024.06    68.2
2024.06    67.7
2024.06    67.5
2024.06    67.3
2024.06    65.9
2024.06    65.7
2024.11    65.6
2024.06    65.5
2024.11    65.4
2024.11    64.7
2024.06    64.3
2024.08    63.4
2024.01    63.4
2024.01    63.4
2024.11    60.9
2024.08    60.9
2024.01    60.9
2024.11    58.7
2025.10    58.48
2025.10    58.3
2024.01    58.2
2025.10    56.98
2025.10    56.82
2025.10    53.24
2025.10    39.84
2025.10    39.18
2025.10    38.78
2024.08    38.1
2024.01    38.1
2025.10    35.98