Instruction Following on AlpacaEval2 short-context

Best AlpacaEval2 Score: 22.9 (officially post-trained)

Updated 3mo ago

Evaluation Results

Method	Date	AlpacaEval2 Score
officially post-trained	2024.10	22.9
officially post-trained	2024.10	22.4
DPO w/ LongReward	2024.10	15.4
DPO w/ Contrast	2024.10	14.5
DPO w/ LongReward	2024.10	14.2
DPO w/ SRM	2024.10	14.2
DPO w/ Contrast	2024.10	13.8
DPO w/ SRM	2024.10	13.7
SFT	2024.10	12.5
SFT	2024.10	12.4