Open-ended Instruction Following on Self-instruct Eval
[Chart: Win Rate (A) over time on Self-instruct Eval. Current best: BPO + Llama-2-chat 7B at 53.6 (Nov 7, 2023). Metrics tracked: Win Rate (A), Tie Rate, Win Rate (B), AWR.]
Evaluation Results

| Method | Notes | Date | Win Rate (A) | Tie Rate | Win Rate (B) | AWR |
|---|---|---|---|---|---|---|
| BPO + Llama-2-chat 7B | Base LLM=Llama-2-chat,... | 2023.11 | 53.6 | 9.9 | 36.5 | 17.4 |
| BPO + Llama-2-chat 13B | Base LLM=Llama-2-chat,... | 2023.11 | 51.2 | 11.9 | 36.9 | 18.1 |
| BPO + Llama-2-chat 13B (Cross-size) | Base LLM=Llama-2-chat,... | 2023.11 | 48.4 | 4.8 | 46.8 | 11.9 |
| BPO + Vicuna-v1.3 13B | Base LLM=Vicuna-v1.3,... | 2023.11 | 46.4 | 13.9 | 39.7 | 13.1 |
| BPO + Llama-2-chat 70B | Base LLM=Llama-2-chat,... | 2023.11 | 46.0 | 13.1 | 40.9 | 16.8 |
| BPO + Vicuna-v1.3 7B | Base LLM=Vicuna-v1.3,... | 2023.11 | 42.0 | 21.1 | 36.9 | 18.5 |
| BPO + Llama-2-chat 7B (Cross-size) | Base LLM=Llama-2-chat,... | 2023.11 | 40.1 | 5.1 | 54.8 | -7.1 |
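For working with these results offline, the leaderboard rows can be encoded as plain tuples and re-ranked by any metric. A minimal sketch (values copied from the table above; the tuple layout and variable names are illustrative, not part of the benchmark):

```python
# Rows from the Self-instruct Eval leaderboard above.
# Tuple layout (assumed for this sketch):
# (method, win_rate_a, tie_rate, win_rate_b, awr)
results = [
    ("BPO + Llama-2-chat 7B", 53.6, 9.9, 36.5, 17.4),
    ("BPO + Llama-2-chat 13B", 51.2, 11.9, 36.9, 18.1),
    ("BPO + Llama-2-chat 13B (Cross-size)", 48.4, 4.8, 46.8, 11.9),
    ("BPO + Vicuna-v1.3 13B", 46.4, 13.9, 39.7, 13.1),
    ("BPO + Llama-2-chat 70B", 46.0, 13.1, 40.9, 16.8),
    ("BPO + Vicuna-v1.3 7B", 42.0, 21.1, 36.9, 18.5),
    ("BPO + Llama-2-chat 7B (Cross-size)", 40.1, 5.1, 54.8, -7.1),
]

# Rank by Win Rate (A), highest first, as the chart does.
ranked = sorted(results, key=lambda r: r[1], reverse=True)
for method, wa, tie, wb, awr in ranked:
    print(f"{method}: Win(A)={wa} Tie={tie} Win(B)={wb} AWR={awr}")
```

Swapping the `key` index re-sorts by another column, e.g. `key=lambda r: r[4]` ranks by AWR instead.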