
Multimodal Reasoning on WeMath

72.2 Accuracy

TGRL-DAPO

[Chart: accuracy over time, Jun 5, 2025 to Apr 3, 2026]
Updated 4d ago

Evaluation Results

Date     Accuracy
2026.03  72.2      -      -
2026.03  72        -      -
2026.03  71.05     -      -
2026.03  70.3      -      -
2026.03  70.29     -      -
2026.03  70.23     -      -
2026.03  69.3      -      -
2026.03  69.2      -      -
2026.03  68.84     -      -
2026.03  68.2      -      -
2026.03  67.9      -      -
2026.03  67.8      -      -
2026.03  67.4      -      -
2026.02  63.85     62.6   8.8
2026.02  63.44     19.5   6.6
2026.03  63.1      -      -
2026.02  63        97.7   1.6
2026.02  62.3      -      -
2026.03  62.1      -      -
2026.02  61.63     13.5   5.1
2026.02  61.58     -      -
2026.02  60.23     -      -
2026.02  59.2      -      -
2026.02  58.61     -      -
2026.02  58.53     -      -
2026.02  58.21     -      -
2026.02  58.15     -      -
2026.02  58.02     -      -
2026.04  58        -      -
2026.02  57.99     -      -
2026.02  57.9      -      -
2026.02  57.59     -      -
2026.02  57.59     -      -
2026.02  57.54     -      -
2026.02  57.44     -      -
2026.04  56.57     -      -
2026.02  55.85     72.5   10.3
2025.09  55.5      -      -
2026.02  55.44     60.5   8.3
2026.04  54.95     -      -
2026.02  54.9      99.6   1.8
2026.02  54.89     -      -
2026.04  54.76     -      -
2026.02  54.63     34.2   6.1
2026.02  54.14     -      -
2026.04  54.1      -      -
2026.02  52.82     -      -
2026.02  52.41     -      -
2026.04  52.19     -      -
2025.06  52.1      -      -
2025.06  51.9      -      -
2025.06  50.8      -      -
2026.02  49.5      -      -
2025.06  49.3      -      -
2025.09  49.3      -      -
2025.06  49.2      -      -
2025.06  49.1      -      -
2025.06  49.1      -      -
2025.06  48.6      -      -
2025.06  48.5      -      -
2026.02  47.7      -      -
2025.06  47.4      -      -
2025.06  46        -      -
2025.06  45.8      -      -
2026.02  45.6      -      -
2025.06  45.4      -      -
2025.06  45        -      -
2026.02  43.3      -      -
2025.06  43.1      -      -
2025.06  43.1      -      -
2025.06  43        -      -
2025.11  41.8      -      -
2026.03  41.2      -      -
2026.02  40.7      -      -
2026.03  40.7      -      -
2025.06  40.7      -      -
2026.02  39.6      -      -
2025.09  39.4      -      -
2026.02  39.3      -      -
2025.11  39.3      -      -
2025.09  39.3      -      -
2025.06  39.1      -      -
2026.03  39        -      -
2025.06  39        -      -
2026.02  38.9      -      -
2026.02  38.9      -      -
2025.11  38.9      -      -
2025.06  38.9      -      -
2025.09  38.9      -      -
2026.02  38.8      -      -
2025.06  38.7      -      -
2025.06  38.4      -      -
2026.02  38.1      -      -
2026.02  38.1      -      -
2025.11  38.1      -      -
2025.06  37.9      -      -
2025.06  37.3      -      -
2025.11  37.1      -      -
2025.06  37.1      -      -
2025.06  37.1      -      -
Showing 100 of 129 rows