
LLM Alignment on HH-RLHF 300 prompts

69.8 Win/Tie Rate vs Vanilla (GPT-4o)

CARDS

Updated 3mo ago

Evaluation Results

Method	Win/Tie Rate vs Vanilla (GPT-4o)
CARDS 2025.11	69.8
CARDS 2025.11	64.5
STARS 2025.11	64.5
DPO 2025.11	60.5
STARS 2025.11	60.2
Tree-bon 2025.11	59.2
RAIN 2025.11	59
ARGS 2025.11	58.8
DPO 2025.11	56.4
Tree-bon 2025.11	55.2
RAIN 2025.11	55
ARGS 2025.11	54.8
Speculative-Decoding 2025.11	50.4
Speculative-Decoding 2025.11	50.2
Vanilla LLM 2025.11	50
Vanilla LLM 2025.11	50