Multiple Choice Question Answering on Long-form Multiple Choice Question Answering benchmark

Accuracy: 83.2

Qwen2.5-VL

[Chart: Accuracy, Mar 26, 2026]
Updated 22d ago

Evaluation Results

Date     Accuracy  Method  Links
2026.03  83.2
2026.03  82.6
2026.03  82.5
2026.03  82
2026.03  81.9
2026.03  64.4
2026.03  64.4
2026.03  64.2
2026.03  61.9
2026.03  61.8