Mathematical Reasoning on MATH (OOD) (Calibration Metrics)
[Chart: KL Div (alpha=1) by date. Plotted point: GEB-arctanh(π − 1), 71, Sep 27, 2025. Selectable metrics: KL Div (alpha=1), Hellinger Dist (alpha=0.5), f-KL (alpha=0), Average Score.]
Updated 4d ago
Evaluation Results
| Method | Details | Date | KL Div (alpha=1) | Hellinger Dist (alpha=0.5) | f-KL (alpha=0) | Average Score |
| --- | --- | --- | --- | --- | --- | --- |
| GEB-arctanh(π − 1) | Backbone=LLaMA-3-8B-SF... | 2025.09 | 71 | 67.6 | 69.2 | 69.3 |
| GEB-π | Backbone=LLaMA-3-8B-SF... | 2025.09 | 69.2 | 69.6 | 71.6 | 70.1 |
| GEB-1/π | Backbone=LLaMA-3-8B-SF... | 2025.09 | 68.4 | 70.2 | 69.2 | 69.3 |
| f-DPO | Backbone=LLaMA-3-8B-SF... | 2025.09 | 67.6 | 69 | 69.2 | 68.6 |
| FEB | Backbone=LLaMA-3-8B-SF... | 2025.09 | 67.6 | 68.6 | 68.6 | 68.3 |