
Hallucination Diagnosis on RSHBench

44.47 Object Accuracy (OBJ)

Qwen3-VL-4B

Mar 3, 2026
Updated 1mo ago

Evaluation Results

Method  Links
2026.03  44.47  31.81  14.82  61.19  29.92  0.27  2.43  15.63  34.77  61.19
2026.03  40.7   28.3   11.32  55.53  20.49  0.27  1.08  14.29  24.53  56.33
2026.03  35.85  25.88  12.13  51.48  26.15  0.27  3.77  15.36  29.11  52.02
2026.03  35.31  21.02  9.16   48.79  21.02  0     0.54  9.16   22.91  49.6
2026.03  33.42  31.54  15.36  48.79  27.49  0.54  0.81  15.09  29.65  49.06
2026.03  33.42  33.96  15.9   49.87  28.3   3.77  2.16  18.06  29.65  49.87
2026.03  30.19  26.68  11.32  46.9   19.41  0.27  0.27  11.05  21.02  47.44
2026.03  28.03  25.61  13.48  38.54  21.83  2.16  1.89  15.63  24.8   38.81
2026.03  25.88  29.11  14.29  46.9   25.61  2.96  2.7   16.71  26.95  47.71