
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning

About

In the realm of Large Language Models (LLMs), the balance between instruction data quality and quantity is a focal point. Recognizing this, we introduce a self-guided methodology for LLMs to autonomously discern and select cherry samples from open-source datasets, effectively minimizing manual curation and potential cost for instruction tuning an LLM. Our key innovation, the Instruction-Following Difficulty (IFD) metric, emerges as a pivotal metric to identify discrepancies between a model's expected responses and its intrinsic generation capability. Through the application of IFD, cherry samples can be pinpointed, leading to a marked uptick in model training efficiency. Empirical validations on datasets like Alpaca and WizardLM underpin our findings; with a mere 10% of original data input, our strategy showcases improved results. This synthesis of self-guided cherry-picking and the IFD metric signifies a transformative leap in the instruction tuning of LLMs, promising both efficiency and resource-conscious advancements. Codes, data, and models are available: https://github.com/tianyi-lab/Cherry_LLM
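
The abstract names the IFD metric but does not spell out its formula. The sketch below is an illustration only, assuming the ratio form described in the paper: the average loss of the answer given the instruction, divided by the average loss of the answer alone. The helper names (`mean_nll`, `ifd_score`, `select_cherry`) and the plain-list inputs are hypothetical and are not taken from the Cherry_LLM repository; in practice the per-token log-probabilities would come from the model being tuned.

```python
from typing import List, Sequence


def mean_nll(token_logprobs: Sequence[float]) -> float:
    """Average negative log-likelihood over the answer tokens."""
    if not token_logprobs:
        raise ValueError("need at least one answer token")
    return -sum(token_logprobs) / len(token_logprobs)


def ifd_score(cond_logprobs: Sequence[float],
              direct_logprobs: Sequence[float]) -> float:
    """Assumed IFD: loss(answer | instruction) / loss(answer alone).

    cond_logprobs:   log p(answer token | instruction, previous answer tokens)
    direct_logprobs: log p(answer token | previous answer tokens only)
    """
    direct = mean_nll(direct_logprobs)
    if direct == 0.0:
        raise ValueError("direct answer loss is zero; ratio is undefined")
    return mean_nll(cond_logprobs) / direct


def select_cherry(scores: Sequence[float], fraction: float = 0.10) -> List[int]:
    """Return indices of the highest-scoring samples (top `fraction`)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy example with made-up numbers: the instruction halves the answer loss.
example = ifd_score([-0.5, -0.5], [-1.0, -1.0])
top = select_cherry([0.2, 0.9, 0.5, 0.7, 0.1, 0.3, 0.4, 0.6, 0.8, 0.05], 0.10)
```

Under this assumed definition, a higher score means the instruction helps the model less in producing the answer, so the sample is treated as harder. The 10% default mirrors the data fraction quoted in the abstract; any further filtering the authors apply is not reproduced here.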

Ming Li, Yong Zhang, Zhitao Li, Jiuhai Chen, Lichang Chen, Ning Cheng, Jianzong Wang, Tianyi Zhou, Jing Xiao • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Language Modeling | WikiText2 | Perplexity 10.17 | 1875 |
| Language Modeling | WikiText-2 (test) | PPL 18.54 | 1541 |
| Commonsense Reasoning | HellaSwag | Accuracy 75.59 | 1460 |
| Visual Question Answering | VQA v2 | Accuracy 65 | 1165 |
| Object Hallucination Evaluation | POPE | Accuracy 82.6 | 935 |
| Language Modeling | WikiText-2 | Perplexity (PPL) 13.51 | 841 |
| Commonsense Reasoning | WinoGrande | Accuracy 68.63 | 776 |
| Language Modeling | PTB | Perplexity 19.31 | 650 |
| Commonsense Reasoning | PIQA | Accuracy 77.66 | 647 |
| Multimodal Evaluation | MME | -- | 557 |
Showing 10 of 79 rows
