TaP: A Taxonomy-Guided Framework for Automated and Scalable Preference Data Generation
About
Conducting supervised and preference fine-tuning of large language models (LLMs) requires high-quality datasets to improve their ability to follow instructions and align with human preferences and values. However, constructing such datasets is resource-intensive, and most publicly available datasets are in English. To address these challenges, we propose the Taxonomy-Guided Preference Data Generation (TaP) framework for automated, scalable preference dataset construction across languages. TaP uses a structured taxonomy to provide fine-grained control over dataset composition, ensuring diversity and broad coverage. We use TaP-generated datasets to perform supervised and preference fine-tuning on multiple LLMs. Experimental results demonstrate that LLMs trained on TaP-generated datasets outperform those trained on existing open-source datasets. Remarkably, LLMs trained on TaP-generated datasets outperform models trained on an open-source dataset that is 180× larger.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Dialogue Alignment Evaluation | AlignBench | Reasoning | 6.76 | 90 |
| Multi-turn Dialogue Evaluation | MT-Bench-zh | Score | 6.34 | 90 |
| Instruction Following | AlignBench | Reasoning Score | 7.42 | 60 |
| Instruction Following | MT-Bench-zh | Score | 6.83 | 60 |
| General LLM Evaluation | AlignBench | Reasoning Score | 7.27 | 20 |
| General LLM Evaluation | MT-Bench-zh | Overall Score | 6.66 | 7 |