LLM Alignment Evaluation on Qwen2.5-14B-Instruct High-Variance (Top 20%)

Average Reward (μ): 5.67

Base (Best-of-K)

[Chart: Average Reward (μ); y-axis ticks 5.3788, 5.4544, 5.53, 5.6056; x-axis Mar 9, 2026]
Updated 1mo ago

Evaluation Results

Method  Links
2026.03  5.67  --  4.74  4.85
2026.03  5.49  --  1.85  5.19
2026.03  5.44  --  3.14  5.1
2026.03  5.41  --  4.59  4.99
2026.03  5.41  --  2.91  5.12
2026.03  5.39  --  3.67  5.04