Competition Mathematics Reasoning on MATH500
[Chart: Full Length for Minimal-core extraction, top value 13.7; x-axis date May 14, 2026. Metric tabs: Full Length, Core Length, CR, RM, Top-3 Mass, Retention.]
Updated 19d ago
Evaluation Results
| Method | Model | Date | Full Length | Core Length | CR | RM | Top-3 Mass | Retention |
|---|---|---|---|---|---|---|---|---|
| Minimal-core extraction | GPT-5 | 2026.05 | 13.7 | 6.5 | 47 | 53 | 69 | 90 |
| Minimal-core extraction | DeepSeek-R1-Dist... | 2026.05 | 13.2 | 7 | 53 | 47 | 64 | 88 |
| Minimal-core extraction | Qwen3-32B | 2026.05 | 12.8 | 7.1 | 55 | 45 | 63 | 87 |
| Minimal-core extraction | DeepSeek-R1-Dist... | 2026.05 | 12.3 | 7.3 | 59 | 41 | 59 | 84 |
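
The page does not define the CR and RM columns. As a rough sanity check, the sketch below tests one reading of the results above: that CR is Core Length divided by Full Length, expressed as a rounded percentage, and that RM is 100 minus CR. This relation is an inference from the reported numbers, not a definition given by the source, and the names `rows`, `implied_cr`, and `check_rows` are hypothetical helpers introduced only for illustration.

```python
# Values copied from the results above:
# (model label as shown, Full Length, Core Length, CR, RM).
# The two truncated "DeepSeek-R1-Dist..." labels are kept exactly as shown.
rows = [
    ("GPT-5", 13.7, 6.5, 47, 53),
    ("DeepSeek-R1-Dist...", 13.2, 7.0, 53, 47),
    ("Qwen3-32B", 12.8, 7.1, 55, 45),
    ("DeepSeek-R1-Dist...", 12.3, 7.3, 59, 41),
]


def implied_cr(full_length: float, core_length: float) -> int:
    """ASSUMPTION: CR = 100 * Core Length / Full Length, rounded to an integer."""
    return round(100 * core_length / full_length)


def check_rows(table):
    """Return one bool per row: True if the assumed relations match the reported values."""
    results = []
    for _model, full, core, cr, rm in table:
        cr_matches = implied_cr(full, core) == cr
        rm_matches = (100 - cr) == rm  # ASSUMPTION: RM is the complement of CR
        results.append(cr_matches and rm_matches)
    return results


if __name__ == "__main__":
    for (model, *_), ok in zip(rows, check_rows(rows)):
        print(model, "consistent" if ok else "inconsistent")
```

If a row came back inconsistent, that would suggest the assumed relation is wrong or that the reported values were rounded differently; it would not by itself indicate an error in the leaderboard.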