Mathematical Reasoning on GSM8K (Selection Accuracy)
[Chart: Selection Accuracy over time. Highlighted point: Self-Certainty (SC), 33.07, Jan 20, 2026.]
Evaluation Results
Method                      Perturbation  Date     Selection Accuracy
Self-Certainty (SC)         none          2026.01  33.07
Contrastive Causality (CC)  none          2026.01  32.97
Self-Certainty (SC)         shuffling     2026.01  32.23
Self-Certainty (SC)         small-eval    2026.01  32.17
Contrastive Causality (CC)  small-eval    2026.01  32.13
Contrastive Causality (CC)  shuffling     2026.01  31.73