Hallucination Evaluation on MMHal-Bench

4.7 MMHal Score

Qwen2.5-VL + FINER-Tuning

[Chart: MMHal Score over time; x-axis from Nov 13, 2023 to Apr 14, 2026; y-axis ticks from 2.36 to 4.1825]
Updated 3d ago

Evaluation Results

Method | Links
4.715
2026.03
4.711
4.710
2026.03
4.618
4.614
2026.03
4.519
2026.01
4.3234.38
2026.03
4.1226
2026.01
4.0918.2
2026.03
429
2026.03
3.9529
2026.03
3.8733
2026.03
3.8331
2026.01
3.837.5
2026.01
3.7636.46
2026.03
3.7435
2026.01
3.7239.1
2026.01
3.724.9
2025.12
3.723
2026.03
3.737
2026.01
3.6836.46
2026.01
3.5640.63
2026.01
3.5539.58
2026.01
3.5339.58
2026.03
3.534
3.540
2024.05
3.4928.1
2025.03
3.4928
2024.05
3.4426
2024.05
3.3134.4
2025.12
3.3132
2026.03
3.343
2025.12
3.2129
2024.05
3.1932.3
2024.05
3.1532.3
2025.12
3.1236
2026.01
3.1145.63
2026.01
3.149
2024.05
3.0838.5
2024.05
3.0836.5
2026.01
3.0847
2024.05
3.0728.1
2026.01
3.0748
2026.01
3.0747
2024.05
3.0636.5
2026.01
3.0629.2
2023.12
3.02-
2025.12
3.0242
2024.05
2.9532.3
2025.12
2.9341
2026.01
2.9253
2025.12
2.945
2.89-
2025.05
2.8836
2025.12
2.8841
2026.01
2.8318.8
2026.01
2.8357
2025.12
2.8345
2026.04
2.8349
2026.01
2.8257
2023.12
2.81-
2025.05
2.840
2026.01
2.7917.5
2024.05
2.7638.5
2026.01
2.7638.5
2025.12
2.7546
2025.06
2.72-
2026.01
2.7219.3
2026.04
2.7251
2026.02
2.7156.1
2026.04
2.754.2
2024.05
2.66-
2026.04
2.6658.3
2025.12
2.6548
2023.11
2.6448
2025.12
2.6447
2025.06
2.62-
2025.05
2.6243
2024.05
2.61-
2025.03
2.6150
2023.11
2.649
2025.06
2.59-
2023.12
2.59-
2023.11
2.5452
2025.05
2.5457
2025.12
2.5447
2025.12
2.5457
2026.02
2.5458.9
2026.04
2.5459.4
2025.06
2.53-
2023.11
2.5357
2024.05
2.53-
2026.04
2.5257.3
2026.03
2.5150
2025.05
2.557
2025.05
2.4845
2026.01
2.4656
2026.01
2.4663.54
2024.05
2.4551
2026.01
2.4551
Showing 100 of 216 rows