Multimodal Reasoning on MMStar (Acc., AvgLen, Ratio)

[Chart: accuracy over time. Highlighted point: Octopus-8B (Ours), 75.2 accuracy. Date axis: Feb 9, 2026 to Feb 10, 2026.]
Updated 3d ago

Evaluation Results

Date      Acc.    AvgLen   Ratio   Method   Links
2026.02   75.2    -        -
2026.02   74.7    -        -
2026.02   74.1    -        -
2026.02   73.7    -        -
2026.02   73.3    -        -
2026.02   73.1    -        -
2026.02   72.9    -        -
2026.02   72.7    -        -
2026.02   72.5    -        -
2026.02   72.1    -        -
2026.02   70      -        -
2026.02   69.7    -        -
2026.02   69.3    -        -
2026.02   68.25   -        -
2026.02   65.1    -        -
2026.02   64.7    -        -
2026.02   59      86.8     1.5
2026.02   58.3    422.6    7.2
2026.02   58.1    225.5    3.9
2026.02   57.97   -        -
2026.02   57.7    229      4
2026.02   57.42   -        -
2026.02   56.8    -        -
2026.02   56.4    87.5     1.6
2026.02   56.2    248      4.4
2026.02   55.1    452.7    8.2
2026.02   54.1    211.1    3.9
2026.02   50.74   -        -
2026.02   50.02   -        -