Causal Reasoning on CaLM Mathematical
Top result: GRPO, 93.5 accuracy
[Figure: accuracy over time. Y-axis: Accuracy. Data point dated Feb 6, 2026.]
Evaluation Results
Method          Setting                     Links     Accuracy
GRPO            Training=GRPO               2026.02   93.5
Base            Training=Base               2026.02   27.2
GPT-3.5-Turbo   Context=Best Performan...   2026.02   27