Calibration on Calibration Evaluation Set
[Chart: ECE over time. Highlighted entry: Base Model, ECE 0.102, Jan 22, 2026.]
Evaluation Results
Method                  Details                      Date      ECE
Base Model              Backbone=Qwen2.5-3B          2026.01   0.102
RKL-GRPO                Backbone=Qwen2.5-3B, A...    2026.01   0.125
RKL-DAPO                Backbone=Qwen2.5-3B, A...    2026.01   0.129
RKL-GSPO                Backbone=Qwen2.5-3B, A...    2026.01   0.131
CARE-GRPO               Backbone=Qwen2.5-3B, A...    2026.01   0.132
CARE-DAPO               Backbone=Qwen2.5-3B, A...    2026.01   0.134
CARE-GSPO               Backbone=Qwen2.5-3B, A...    2026.01   0.139
GRPO (No Constraint)    Backbone=Qwen2.5-3B, A...    2026.01   0.21
DAPO (No Constraint)    Backbone=Qwen2.5-3B, A...    2026.01   0.24
GSPO (No Constraint)    Backbone=Qwen2.5-3B, A...    2026.01   0.26
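
For readers unfamiliar with the metric, ECE (expected calibration error) is commonly computed by grouping predictions into confidence bins and averaging the gap between accuracy and mean confidence in each bin, weighted by bin size. Lower is better. The sketch below shows that common equal-width-bin formulation. This page does not state how the benchmark computes ECE, so the bin count, the binning rule, and the function name `expected_calibration_error` are assumptions made for illustration only.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,  # assumed bin count; the benchmark's setting is not given
) -> float:
    """Equal-width-bin ECE: sum over bins of (bin share) * |accuracy - mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins   # total confidence per bin
    hit_count = [0] * n_bins    # number of correct predictions per bin
    bin_count = [0] * n_bins    # number of predictions per bin

    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        # Bin k covers [k/n_bins, (k+1)/n_bins); a confidence of 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        hit_count[idx] += 1 if ok else 0
        bin_count[idx] += 1

    ece = 0.0
    for k in range(n_bins):
        if bin_count[k] == 0:
            continue  # empty bins contribute nothing
        accuracy = hit_count[k] / bin_count[k]
        mean_conf = conf_sum[k] / bin_count[k]
        ece += (bin_count[k] / n) * abs(accuracy - mean_conf)
    return ece
```

Under this formulation, a model that reports 0.9 confidence on four answers and gets three of them right has an ECE of about 0.15, since all four predictions fall in one bin with accuracy 0.75 and mean confidence 0.9.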