
Multi-modal Reasoning on MUIRBENCH

92.94 Difference Reasoning Accuracy

Human

[Chart: Difference Reasoning Accuracy; Feb 28, 2026]
Updated 1mo ago

Evaluation Results

Method | Links
2026.02  92.94  93.15  94.87  97.56  85.71  94.83   87.5  94.62  82.05  98.99     98  87.76   86.3
2026.02  68.24  73.81  81.62  57.32  53.57  85.78  34.38  86.02  51.28  92.96     45  62.76  68.15
2026.02  62.65  72.81  82.48  55.49     50   84.7  40.62  84.95  52.56  92.46     48   64.8  67.47
-        60.59      -      -      -      -      -      -      -  52.56  79.15     57  50.51  64.04
2026.02  60.29     68  49.15  44.51   36.9  86.85  23.44  71.51  51.28  88.69     56  56.12  80.14
2026.02  45.29  49.35  28.63  35.98  28.57  66.59   12.5  59.14  47.44  64.82     48  41.33  43.84
2026.02  34.41  35.43  29.91  28.05   13.1  35.99   12.5  27.41  48.72  46.98     35  32.65  43.49
2026.02  33.53  33.23  26.92  25.61  23.81  22.84   4.69  39.78  29.49  44.72     26  38.78   47.6
2026.02  32.65  33.62   31.2  27.44  26.19  37.28  15.63  48.39  43.59  37.69     34  31.63  23.97
2026.02  28.82   44.5  38.46  33.54  26.19  53.88  18.75  56.99  38.46  67.59     26  48.47  35.62
2026.02  27.65  26.08  21.79  26.22  26.19  24.78  15.62  56.45  39.74  25.38     21  17.86  17.12
2026.02  24.71  33.12  19.66  28.66     25  40.95  10.94  56.45  30.77  42.71     31  24.49  30.14
-        23.18  23.99  20.98  23.41     25  24.12  22.81     25     25  29.56     25     20   21.3
2026.02  22.06  33.31  36.32  26.22  33.33  37.93  21.88   54.3  41.03  38.19     12  38.27     25
2026.02  21.76  23.73  21.79  26.83  30.95  24.14  21.88  22.58  25.64  31.91     25  18.88  15.41
2026.02     20  24.38  25.21  29.27  14.29  20.26  20.31  36.56  25.64  31.66     20  22.96  20.89
-           20  17.35  11.97  14.02     25  17.03  18.75  14.52  21.79  21.61     13  17.35  14.73
2026.02  19.71  20.85   14.1  26.22  16.67  21.34   12.5   41.4  41.03   19.6     13  16.33  15.75
2026.02  19.12  28.15  34.19  26.22  32.14  25.65   7.81  42.47  39.74  35.43     12  23.98  28.42