Mathematical Reasoning on GSM8K (test) (Acc, Steer, UTS, Entropy)
[Chart: Accuracy on GSM8K (test). Top entry: UTS-guided system, 91.8, Feb 6, 2026. Selectable metrics: Accuracy, Steerability, UTS, Entropy.]
Updated 3mo ago
Evaluation Results
Method             Setting              Date     Accuracy  Steerability  UTS    Entropy
UTS-guided system  Temperature (T)=1.0  2026.02  91.8      1.72          1.612  0.0006
UTS-guided system  Temperature (T)=0.5  2026.02  91.65     1.45          1.613  0.0001
UTS-guided system  Temperature (T)=1.5  2026.02  91.11     1.85          1.61   0.0015
UTS-guided system  Temperature (T)=2.0  2026.02  90.32     2.1           1.61   0.0018
UTS-guided system  Temperature (T)=2.5  2026.02  90.31     2.28          1.605  0.0068
UTS-guided system  Temperature (T)=3.0  2026.02  88.84     2.44          1.603  0.0058