Autoformalization on PutnamBench
Best result: RL (GRPO) 2B, Mean Cycle Consistency 0.561
[Chart: Mean Cycle Consistency by date; x-axis tick: Mar 25, 2026]
Updated 23d ago
Evaluation Results
Method                 Details                     Date      Mean Cycle Consistency
RL (GRPO) 2B           Model Scale=2B, Traini...   2026.03   0.561
SFT Curriculum 9B      Model Scale=9B, Traini...   2026.03   0.548
SFT No-Curriculum 2B   Model Scale=2B, Traini...   2026.03   0.432
SFT Curriculum 2B      Model Scale=2B, Traini...   2026.03   0.422
Base 2B                Model Scale=2B, Evalua...   2026.03   0.034
Base 9B                Model Scale=9B, Evalua...   2026.03   0.004