Expected Calibration Error on MusQ
[Chart: ECE by date. Leading entry: CDKC, ECE 48.08, Feb 13, 2026]
Evaluation Results
| Method | Configuration | Date | ECE |
|---|---|---|---|
| CDKC | Backbone=Qwen2.5-7B-In... | 2026.02 | 48.08 |
| CDKC | Backbone=Qwen2.5-7B-In... | 2026.02 | 53.01 |
| GRPO | Backbone=Qwen2.5-7B-In... | 2026.02 | 54.79 |
| Know What | Backbone=Qwen2.5-7B-In... | 2026.02 | 76.15 |
| BARREL | Backbone=Qwen2.5-7B-In... | 2026.02 | 78.61 |
| CGKE | Backbone=Qwen2.5-7B-In... | 2026.02 | 80.34 |
| LLKD-SFT | Backbone=Qwen2.5-7B-In... | 2026.02 | 80.68 |
| Vanilla SFT | Backbone=Qwen2.5-7B-In... | 2026.02 | 80.83 |
| Vanilla LLM | Backbone=Qwen2.5-7B-In... | 2026.02 | 81.61 |
| CRew-DPO | Backbone=Qwen2.5-7B-In... | 2026.02 | 85.05 |
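
For readers unfamiliar with the metric, the sketch below shows one common way to compute Expected Calibration Error: predictions are grouped into equal-width confidence bins, and the gap between average confidence and accuracy in each bin is averaged, weighted by bin size. Lower is better. The function name, the 10-bin default, and the percent scaling are assumptions made for illustration. This page does not state the binning scheme or scale used for the scores above, so this may not reproduce them exactly.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[int],
    n_bins: int = 10,  # assumed bin count; the benchmark's setting is not given
) -> float:
    """Return ECE in percent using equal-width confidence bins.

    confidences: predicted confidence in [0, 1] for each answer.
    correct: 1 if the answer was right, 0 otherwise.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins  # summed confidence per bin
    hit_sum = [0.0] * n_bins   # number of correct answers per bin
    count = [0] * n_bins       # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # Bins are [lo, hi); a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += ok
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum[b] / count[b]
        accuracy = hit_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - avg_conf)

    return 100.0 * ece  # assumed percent scale
```

Under these assumptions, a model that answers with 0.9 confidence four times and is right three times has an accuracy of 0.75 in that bin, giving an ECE of about 15.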