LLM Alignment Evaluation on Qwen2.5-14B-Instruct Overall
[Chart: Reward (Avg μ) by method. Highlighted point: Base (Best-of-K), 6.31 (Mar 9, 2026). Selectable metrics: Reward (Avg μ), Risk (Avg σ̂), Tradeoff, CVaR10%.]
Evaluation Results
| Method | Base Model | Date | Reward (Avg μ) | Risk (Avg σ̂) | Tradeoff | CVaR10% |
|---|---|---|---|---|---|---|
| Base (Best-of-K) | Qwen2.5-14B... | 2026.03 | 6.31 | 3.14 | 0.03 | 5.11 |
| DARC-ϵ | Qwen2.5-14B... | 2026.03 | 6.18 | 2.53 | 1.12 | 5.51 |
| DARC-τ | Qwen2.5-14B... | 2026.03 | 6.11 | 2.71 | 0.69 | 5.43 |
| DARC | Qwen2.5-14B... | 2026.03 | 5.92 | 2.73 | 0.46 | 5.38 |
| 2nd-Moment (LCB) | Qwen2.5-14B... | 2026.03 | 5.81 | 2.83 | 0.15 | 5.23 |
| CVaR (Best-of-K) | Qwen2.5-14B... | 2026.03 | 5.73 | 3.01 | -0.29 | 5.16 |
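The page does not define its metric columns, so the following is only a minimal sketch of how such summary statistics are commonly computed from a list of per-response reward scores. It assumes that Reward (Avg μ) is the sample mean, Risk (Avg σ̂) is the sample standard deviation, and CVaR10% is the mean of the worst 10% of rewards; the leaderboard may use different definitions or an extra averaging step. The Tradeoff column is left out because its formula is not given. The function name `summarize_rewards` and the sample data are hypothetical.

```python
import math
import statistics


def summarize_rewards(rewards, alpha=0.10):
    """Summarize reward scores (assumed definitions, not the leaderboard's own).

    mean: sample mean of the rewards
    std:  sample standard deviation (n - 1 denominator)
    cvar: mean of the lowest `alpha` fraction of rewards (lower-tail CVaR)
    """
    if len(rewards) < 2:
        raise ValueError("need at least two rewards to estimate a deviation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    mean = statistics.fmean(rewards)
    std = statistics.stdev(rewards)

    # Keep at least one sample in the tail so the average is defined.
    tail_size = max(1, math.ceil(alpha * len(rewards)))
    worst = sorted(rewards)[:tail_size]
    cvar = statistics.fmean(worst)

    return {"mean": mean, "std": std, "cvar": cvar}


# Hypothetical reward scores, used only to exercise the function.
sample = [float(x) for x in range(1, 11)]
summary = summarize_rewards(sample)
```

Under these assumed definitions, a method with a lower mean reward can still have a higher CVaR10% if its worst responses are less bad, which is the pattern the DARC-ϵ row shows relative to Base (Best-of-K).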