General Reasoning on Timely-Eval

MATH: 82.9

DeepSeek-V3.2

Updated 5mo ago

Evaluation Results

Method	Links	MATH	(unlabeled)	(unlabeled)
DeepSeek-V3.2 2026.01		82.9	43.3	58.7
TimelyLM-8B 2026.01		78	42.5	49.5
Qwen3-32B 2026.01		75	45.7	35.5
GPT-5.1(medium) 2026.01		71.5	46.7	71
Qwen3-8B 2026.01		71.2	40	37.5
Qwen3-14B 2026.01		70.8	41.7	21
Gemini2.5-pro 2026.01		63	37.5	59