LLM Alignment Evaluation on Human Evaluation

Coherence Score: 4.42

Hard-Pair-GRPO

Updated 2mo ago

Evaluation Results

Method	Links	Coherence Score
Hard-Pair-GRPO 2026.05		4.42	4.4	4.51	4.45	4.45
ORPO 2026.05		4.28	4.26	4.36	4.32	4.31
DPO 2026.05		4.25	4.22	4.33	4.29	4.27
Soft-Pair-GRPO 2026.05		4.2	4.15	4.28	4.23	4.22
Standard GRPO 2026.05		4.12	4.05	4.21	4.18	4.14