Multimodal Visual Question Answering on VLMEvalKit Image Benchmarks

46.8 HallBench Accuracy

Qwen2.5-VL-7B

Apr 9, 2026
Updated 9d ago

Evaluation Results

Method  Links
Date     HallBench Accuracy
2026.04  46.8  2,322  85.1  86.3  80.3  66.3  58    70.7  72.2  85.3  100
2026.04  44.1  2,342  85.3  84.1  79.6  65.4  57.3  71    72.6  85    99
2026.04  43.5  2,258  81.8  80.3  78.7  63.8  55.1  69    71    84.4  96.4
2026.04  42.5  2,327  84.2  82.8  78.9  64.8  55.7  67.1  71.3  82.6  96.8
2026.04  42.4  2,318  84.1  82.7  78.8  64.7  55.8  67.8  71.5  84    97
2026.04  42.4  2,245  83.7  77.2  77.7  63.7  55.5  70.3  71.9  82.9  96
2026.04  40.5  2,137  79.2  64.4  75    59.4  49.4  67.8  70.2  77.8  90
2026.04  40.1  2,245  82.4  73    77.5  62.2  54.3  61    70.7  79.4  92.5
2026.04  39.8  2,162  75.8  69    76.4  60.5  53.5  65.4  70.1  83.5  91.3
2026.04  39    2,233  81.5  71.4  76.2  62.5  54.1  65.6  70.9  81.2  92.6
2026.04  37.2  2,127  77.8  59.2  74    59.1  49.4  54.3  69.3  75.6  85.9
2026.04  33.8  2,028  73.8  54.3  71.4  57.8  46.7  62.7  69.4  75.8  83.9
2026.04  33.4  1,979  66.6  52.5  72.1  57.9  48.4  57.8  68.6  81.8  82.8