Reasoning and Question Answering on BoolQ, RTE, HellaSWAG, ARC, OpenBookQA, and PiQA

67.24 Avg Accuracy

Before finetune

Updated 4mo ago

Evaluation Results

Method	Date	Avg Accuracy
Before finetune	2024.06	67.24
Target LLM	2024.06	66.93
ULD	2024.06	66.85
NPO+GD	2024.06	61.77
NPO+KL	2024.06	61.14
Offset-NPO+KL	2024.06	58.72
GA+GD	2024.06	58.34
Offset-DPO+KL	2024.06	56.59
DPO+KL	2024.06	56.34
GA+KL	2024.06	55.41
NPO	2024.06	54.73
DPO+GD	2024.06	53.91
Offset-GA+KL	2024.06	53.78
DPO	2024.06	48.12
GA	2024.06	35.59