
Cross-modal hallucination evaluation on AVHBench

88.19 Overall Accuracy

Qwen 2.5 Omni + MoD-DPO++

[Chart: score over time, Jan 29, 2026 to Mar 3, 2026]
Updated 1mo ago

Evaluation Results

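| Date | Overall Accuracy | | | | | |
|---|---|---|---|---|---|---|
| 2026.03 | 88.19 | - | - | 86.35 | 90.02 | 88.15 |
| 2026.03 | 87.66 | - | - | 85.53 | 89.78 | 87.61 |
| 2026.03 | 87.26 | - | - | 88.89 | 85.63 | 87.23 |
| 2026.03 | 87.08 | - | - | 89.28 | 84.87 | 87.02 |
| 2026.03 | 85.34 | - | - | 75.61 | 95.06 | 84.23 |
| 2026.03 | 84.96 | - | - | 86.18 | 83.75 | 84.95 |
| 2026.03 | 84.39 | - | - | 75.44 | 93.3 | 83.42 |
| 2026.03 | 84.15 | - | - | 76.76 | 91.55 | 83.51 |
| 2026.03 | 83.54 | - | - | 85.92 | 81.17 | 83.47 |
| 2026.03 | 83.36 | - | - | 85.56 | 81.16 | 83.3 |
| 2026.03 | 82.91 | - | - | 84.56 | 81.26 | 82.88 |
| 2026.01 | 81.6 | 78.7 | 84.4 | - | - | - |
| 2026.01 | 79.4 | 79.7 | 79.1 | - | - | - |
| 2026.01 | 79.3 | 78.3 | 80.3 | - | - | - |
| 2026.03 | 79.23 | - | - | 81.51 | 76.94 | 79.16 |
| 2026.01 | 77.8 | 75.8 | 79.7 | - | - | - |
| 2026.01 | 77.4 | 75.7 | 79 | - | - | - |
| 2026.01 | 76.9 | 73 | 80.7 | - | - | - |
| 2026.01 | 73.7 | 70.3 | 77.1 | - | - | - |
| 2026.01 | 70.4 | 66 | 74.8 | - | - | - |
| 2026.03 | 67.17 | - | - | 78.17 | 56.17 | 65.36 |
| 2026.03 | 61.36 | - | - | 57.92 | 64.79 | 61.16 |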
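
The scores in each row appear to be related arithmetically, which may help when reading the results. The sketch below is a consistency check only, not the benchmark's official scoring code. It assumes that the overall accuracy is the arithmetic mean of the two sub-scores that follow it, and that the final value in the 2026.03 rows is their harmonic mean. Both are inferences from the numbers themselves; the page does not state how the scores are derived. The names `ROWS` and `largest_gaps` are hypothetical.

```python
from statistics import harmonic_mean, mean

# (overall, first sub-score, second sub-score, final value or None),
# copied from the results above. 2026.01 rows have no final value.
ROWS = [
    (88.19, 86.35, 90.02, 88.15),
    (87.66, 85.53, 89.78, 87.61),
    (87.26, 88.89, 85.63, 87.23),
    (87.08, 89.28, 84.87, 87.02),
    (85.34, 75.61, 95.06, 84.23),
    (84.96, 86.18, 83.75, 84.95),
    (84.39, 75.44, 93.3, 83.42),
    (84.15, 76.76, 91.55, 83.51),
    (83.54, 85.92, 81.17, 83.47),
    (83.36, 85.56, 81.16, 83.3),
    (82.91, 84.56, 81.26, 82.88),
    (81.6, 78.7, 84.4, None),
    (79.4, 79.7, 79.1, None),
    (79.3, 78.3, 80.3, None),
    (79.23, 81.51, 76.94, 79.16),
    (77.8, 75.8, 79.7, None),
    (77.4, 75.7, 79, None),
    (76.9, 73, 80.7, None),
    (73.7, 70.3, 77.1, None),
    (70.4, 66, 74.8, None),
    (67.17, 78.17, 56.17, 65.36),
    (61.36, 57.92, 64.79, 61.16),
]


def largest_gaps(rows):
    """Return the worst-case deviation from each assumed relationship.

    First element: max |overall - arithmetic mean of the two sub-scores|.
    Second element: max |final value - harmonic mean of the two sub-scores|,
    taken over rows that report a final value.
    """
    mean_gap = max(abs(overall - mean([a, b])) for overall, a, b, _ in rows)
    harm_gap = max(
        abs(last - harmonic_mean([a, b]))
        for _, a, b, last in rows
        if last is not None
    )
    return mean_gap, harm_gap


mean_gap, harm_gap = largest_gaps(ROWS)
print(f"largest arithmetic-mean gap: {mean_gap:.3f}")
print(f"largest harmonic-mean gap:   {harm_gap:.3f}")
```

Small gaps are what rounding of the published values would produce if the assumptions hold. A large gap in any row would suggest either a different scoring rule or a transcription error in that row.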