Mathematical Reasoning on IMO-AnswerBench

84.5 Accuracy

DeepSeek-V3.2

[Chart: Accuracy over time, Dec 2, 2025 to Apr 6, 2026]
Updated 12d ago

Evaluation Results

Date      Accuracy             Method    Links
2025.12   84.5       45
2025.12   83.3       18
2026.04   83.2       -
2026.02   83.1       18,000
2026.02   81.8       36,000
2026.02   78.6       37,000
2025.12   78.6       37
2026.02   78.3       27,000
2025.12   78.3       27
2026.04   76.5       -
2025.12   76         31
2026.04   75.8       -
2026.04   70.5       -
2026.04   70.5       -
2026.04   67.5       -
2026.04   67         -
2026.04   61.5       -
2026.04   57.5       -
2026.04   55.8       -
2026.04   49         -