LLM Alignment Evaluation on Qwen2.5-14B-Instruct High-Variance (Top 20%)
[Chart: Average Reward (μ) over time. Leading entry: Base (Best-of-K), 5.67, Mar 9, 2026. Selectable metrics: Average Reward (μ), Average Risk (σ̂), Risk-Reward Tradeoff, CVaR 10%.]
Updated 1mo ago
Evaluation Results
Method             Base Model       Date      Average Reward (μ)   Average Risk (σ̂)   Risk-Reward Tradeoff   CVaR 10%
Base (Best-of-K)   Qwen2.5-14B...   2026.03   5.67                 -                   -4.74                  4.85
DARC-ϵ             Qwen2.5-14B...   2026.03   5.49                 -                   -1.85                  5.19
DARC               Qwen2.5-14B...   2026.03   5.44                 -                   -3.14                  5.10
CVaR (Best-of-K)   Qwen2.5-14B...   2026.03   5.41                 -                   -4.59                  4.99
DARC-τ             Qwen2.5-14B...   2026.03   5.41                 -                   -2.91                  5.12
2nd-Moment (LCB)   Qwen2.5-14B...   2026.03   5.39                 -                   -3.67                  5.04
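
The page does not state how the Average Reward (μ) and CVaR 10% columns are computed. As a rough sketch only, assuming the common convention that CVaR 10% is the mean of the lowest 10% of per-response rewards (which would fit every CVaR 10% value in the table sitting below its Average Reward), the two metrics could be computed as follows. The function names and the sample rewards are hypothetical and are not taken from the benchmark.

```python
import math


def mean_reward(rewards):
    """Arithmetic mean of the rewards (the assumed meaning of μ)."""
    if not rewards:
        raise ValueError("rewards must be non-empty")
    return sum(rewards) / len(rewards)


def lower_tail_cvar(rewards, alpha=0.10):
    """Mean of the worst ceil(alpha * n) rewards.

    Assumption: CVaR is taken over the lower tail, so a higher
    value means the worst-case responses are better.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    k = max(1, math.ceil(alpha * len(rewards)))
    worst = sorted(rewards)[:k]  # the k lowest rewards
    return sum(worst) / k


# Made-up rewards for illustration only; not benchmark data.
rewards = [6.0, 5.5, 7.0, 3.0, 6.5, 5.0, 6.0, 4.0, 7.5, 2.0]

print("mean:", mean_reward(rewards))
print("CVaR 10%:", lower_tail_cvar(rewards, alpha=0.10))
```

Under this reading, a method can give up some average reward in exchange for a higher CVaR 10%. That matches the pattern in the table, where DARC-ϵ has a lower Average Reward than Base (Best-of-K) but a higher CVaR 10%.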