Multimodal Benchmarking on MMBench (Score and Relative Performance)

MMBench Score: 83.33

Training-Free Debiasing Inference Strategy

[Chart: MMBench score over time, Mar 6, 2026 to May 7, 2026]
Updated 22d ago

Evaluation Results

Date      MMBench Score   Relative Performance
2026.05   83.33           -
2026.05   83.2            -
2026.05   82.7            -
2026.05   82.55           -
2026.05   82.15           -
2026.05   81.13           -
2026.04   72.2            -
2026.04   68.1            -
2026.04   66.5            -
2026.04   66.3            -
2026.04   66.2            -
2026.04   65.7            -
2026.03   65.2            101.4
2026.03   65              101.1
2026.03   64.7            100.6
2026.04   64.7            100
2026.03   64.3            -
2026.03   64.1            99.7
2026.04   64.1            -
2026.03   64              99.5
2026.03   64              99.5
2026.03   63.8            99.2
2026.04   63.8            -
2026.04   63.8            98.5
2026.05   63.7            -
2026.05   63.65           -
2026.04   63.5            99
2026.05   63.3            -
2026.05   63.1            -
2026.03   63              98
2026.03   63              98
2026.04   62.7            97.3
2026.05   62.7            -
2026.04   62.6            97.3
2026.04   62.6            96.9
2026.03   62.5            97.2
2026.03   62.2            96.7
2026.03   62              96.4
2026.04   62              96.4
2026.04   62              96.3
2026.05   62              -
2026.04   61.5            95.1
2026.04   61.3            94.3
2026.04   61              -
2026.04   60.8            94.8
2026.04   60.7            95.4
2026.04   60.4            96
2026.04   60.1            93
2026.04   60              92.6
2026.04   59.5            91.9
2026.04   59.3            93.9
2026.04   58.5            88
2026.04   58.4            89.7
2026.04   57.7            87.8
2026.04   57.6            90.5
2026.04   56.2            85.6
2026.04   56.1            82.1
2026.04   51.4            77.5
2026.04   48              71.4
2026.04   37.8            58.6