Scientific Reasoning & QA on HLE
[Chart: Accuracy by date. Best result: 3.61 (Reinforce++), Dec 3, 2025]
Updated 4d ago
Evaluation Results

Method          Setting                    Date     Accuracy
Reinforce++     Training Domain=Math D...  2025.12  3.61
DVPO            Training Domain=Math D...  2025.12  3.57
Robust Bellman  Training Domain=Math D...  2025.12  3.43
PPO             Training Domain=Math D...  2025.12  3.34
Dr.GRPO         Training Domain=Math D...  2025.12  3.2
GRPO            Training Domain=Math D...  2025.12  3.01
Base            Training Domain=Math D...  2025.12  2.89