Open-ended Instruction Following on AlpacaEval GPT-5.2-judged (test)

64.4 Win Rate

Hybrid KD

[Chart: Win Rate, axis ticks 57.224, 59.087, 60.95, 62.813; data point dated May 25, 2026]
Updated 7d ago

Evaluation Results

Date     Win Rate
2026.05  64.4
2026.05  61.3
2026.05  57.5