
LLM Alignment Evaluation on Qwen2.5-14B-Instruct High-Variance (Top 20%)

Top result: Base (Best-of-K), Average Reward (μ) 5.67

Updated 4mo ago

Evaluation Results

Method	Links	Average Reward (μ)	(unlabeled)	(unlabeled)	(unlabeled)
Base (Best-of-K) 2026.03		5.67	-	-4.74	4.85
DARC-ϵ 2026.03		5.49	-	-1.85	5.19
DARC 2026.03		5.44	-	-3.14	5.1
CVaR (Best-of-K) 2026.03		5.41	-	-4.59	4.99
DARC-τ 2026.03		5.41	-	-2.91	5.12
2nd-Moment (LCB) 2026.03		5.39	-	-3.67	5.04