Mathematical Reasoning on MATH 500 Hard
[Chart: best Accuracy over time. Top result: ReAct, 80.8 (Dec 4, 2025).]
Evaluation Results
Method             Details                    Date     Accuracy
ReAct              Paradigm=Prompting, Ba...  2025.12  80.8
PPO                Paradigm=Fine-tuning,...   2025.12  73.4
NLAC               Paradigm=Fine-tuning,...   2025.12  72.7
GRPO               Paradigm=Fine-tuning,...   2025.12  71.8
NLRL               Paradigm=Fine-tuning,...   2025.12  71.5
NLQL               Paradigm=Fine-tuning,...   2025.12  71.2
Self-Distillation  Paradigm=Fine-tuning,...   2025.12  70.1
RFT                Paradigm=Fine-tuning,...   2025.12  69.4
Self-Distillation  Paradigm=Fine-tuning,...   2025.12  57.1
NLQL               Paradigm=Fine-tuning,...   2025.12  56.4
NLAC               Paradigm=Fine-tuning,...   2025.12  56.2
NLRL               Paradigm=Fine-tuning,...   2025.12  53.4
GRPO               Paradigm=Fine-tuning,...   2025.12  52.5
PPO                Paradigm=Fine-tuning,...   2025.12  52.3
RFT                Paradigm=Fine-tuning,...   2025.12  49.8