
Hallucination Evaluation on MMHal-Bench

4.7 MMHal Score

Qwen2.5-VL + FINER-Tuning

[Chart: MMHal Score over time, Nov 13, 2023 to May 24, 2026]
Updated 5d ago

Evaluation Results

Method  Links
4.7  15
2026.03
4.7  11
4.7  10
2026.03
4.6  18
4.6  14
2026.03
4.5  19
2026.01
4.32  34.38
2026.05
4.32  32.6
2026.05
4.28  35.4
2026.03
4.12  26
2026.01
4.09  18.2
2026.03
4  29
2026.03
3.95  29
2026.03
3.87  33
2026.03
3.83  31
2026.01
3.8  37.5
2026.05
3.77  25.3
2026.01
3.76  36.46
2026.05
3.75  29.7
2026.03
3.74  35
2026.01
3.72  39.1
2026.01
3.7  24.9
2025.12
3.7  23
2026.03
3.7  37
2026.01
3.68  36.46
2026.01
3.56  40.63
2026.01
3.55  39.58
2026.01
3.53  39.58
2026.03
3.5  34
3.5  40
2024.05
3.49  28.1
2025.03
3.49  28
2026.05
3.49  28
2024.05
3.44  26
2024.05
3.31  34.4
2025.12
3.31  32
2026.03
3.3  43
2025.12
3.21  29
2024.05
3.19  32.3
2024.05
3.15  32.3
2025.12
3.12  36
2026.01
3.11  45.63
2026.01
3.1  49
2024.05
3.08  38.5
2024.05
3.08  36.5
2026.01
3.08  47
2026.05
3.08  37
2024.05
3.07  28.1
2026.01
3.07  48
2026.01
3.07  47
2024.05
3.06  36.5
2026.01
3.06  29.2
2026.05
3.04  38
2023.12
3.02  -
2025.12
3.02  42
2026.05
3.01  42
2026.05
2.99  38.4
2024.05
2.95  32.3
2026.05
2.95  32
2026.05
2.94  38.4
2026.05
2.94  42
2025.12
2.93  41
2026.05
2.93  41.1
2026.01
2.92  53
2026.05
2.91  43
2025.12
2.9  45
2.89  -
2025.05
2.88  36
2025.12
2.88  41
2026.05
2.87  39.5
2026.05
2.84  42
2026.01
2.83  18.8
2026.01
2.83  57
2025.12
2.83  45
2026.04
2.83  49
2026.05
2.83  45
2026.01
2.82  57
2023.12
2.81  -
2025.05
2.8  40
2026.05
2.8  48.8
2026.01
2.79  17.5
2026.05
2.77  39.2
2024.05
2.76  38.5
2026.01
2.76  38.5
2026.05
2.76  47.6
2025.12
2.75  46
2026.05
2.74  48.8
2025.06
2.72  -
2026.01
2.72  19.3
2026.04
2.72  51
2026.02
2.71  56.1
2026.04
2.7  54.2
2026.05
2.69  50.9
2026.05
2.69  49
2024.05
2.66  -
2026.04
2.66  58.3
2025.12
2.65  48
2024.11
2.65  48
2023.11
2.64  48
2025.12
2.64  47
Showing 100 of 306 rows