Mathematical Reasoning on LiveMathBench
[Figure: Accuracy on LiveMathBench over time, Jul 14, 2025 to Oct 29, 2025. Highlighted point: SMCS, 74.38.]
Updated 8d ago
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| SMCS | | 2025.07 | 74.38 |
| Self-MoA | | 2025.07 | 60.33 |
| GPT-4.1 | | 2025.07 | 59.5 |
| QwQ-32B | | 2025.07 | 58.68 |
| GSPO + TAC | Backbone=DeepSeek-R1-D... | 2025.10 | 24 |
| DRGRPO | Backbone=DeepSeek-R1-D... | 2025.10 | 21 |
| DeepMath | Backbone=DeepSeek-R1-D... | 2025.10 | 17 |
| VeriThinker | Backbone=DeepSeek-R1-D... | 2025.10 | 14 |
| GSPO + TAC | Backbone=DeepSeek-R1-D... | 2025.10 | 14 |
| BASELINE | Backbone=DeepSeek-R1-D... | 2025.10 | 13 |
| GRPO-LEAD | Backbone=DeepSeek-R1-D... | 2025.10 | 13 |
| ExGRPO | Backbone=DeepSeek-R1-D... | 2025.10 | 13 |
| SkyWork | Backbone=DeepSeek-R1-D... | 2025.10 | 12 |
| BASELINE | Backbone=DeepSeek-R1-D... | 2025.10 | 11 |
| ExGRPO | Backbone=DeepSeek-R1-D... | 2025.10 | 10 |
| Open-RS3 | Backbone=DeepSeek-R1-D... | 2025.10 | 6 |
| DRA-GRPO | Backbone=DeepSeek-R1-D... | 2025.10 | 5 |
| STILL-3 | Backbone=DeepSeek-R1-D... | 2025.10 | 3 |
| Eurus-2 | Backbone=DeepSeek-R1-D... | 2025.10 | 2 |