STEM Reasoning on SAT
Best result: AGPO, Accuracy 0.893
[Chart: Accuracy over time; point shown at May 20, 2026]
Updated 13d ago
Evaluation Results
Method           Configuration              Date     Accuracy
AGPO             clipping=adaptive, ATS...  2026.05  0.893
AGPO             clipping=adaptive, ATS...  2026.05  0.886
GRPO             ATS=true                   2026.05  0.879
Adaptive-KL PPO  -                          2026.05  0.843
GRPO             clipping=fixed ε           2026.05  0.806
DPO              -                          2026.05  0.777
PPO              -                          2026.05  0.761