TAG-INSTRUCT: Controlled Instruction Complexity Enhancement through Structure-based Augmentation

About

High-quality instruction data is crucial for developing large language models (LLMs), yet existing approaches struggle to effectively control instruction complexity. We present TAG-INSTRUCT, a novel framework that enhances instruction complexity through structured semantic compression and controlled difficulty augmentation. Unlike previous prompt-based methods operating on raw text, TAG-INSTRUCT compresses instructions into a compact tag space and systematically enhances complexity through RL-guided tag expansion. Through extensive experiments, we show that TAG-INSTRUCT outperforms existing instruction complexity augmentation approaches. Our analysis reveals that operating in tag space provides superior controllability and stability across different instruction synthesis frameworks.
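The compress, expand, and reconstruct loop described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the tag vocabulary, the stop-word based `compress`, the length-based `toy_reward` (a stand-in for the RL-learned signal), and the template-based `reconstruct` are all hypothetical placeholders chosen only to show the data flow. In the actual framework, an LLM and a learned policy would presumably take these roles.

```python
import re

# Hypothetical tag vocabulary; the paper's real tag space is not given here.
CANDIDATE_TAGS = ["edge cases", "time complexity", "unit tests", "error handling"]

# Hypothetical stop-word list used only by this toy compressor.
STOP_WORDS = {"a", "an", "the", "to", "of", "and", "in", "for", "write"}


def compress(instruction):
    """Toy stand-in for semantic compression: keep unique content words as tags."""
    tags = []
    for word in re.findall(r"[a-z]+", instruction.lower()):
        if word not in STOP_WORDS and word not in tags:
            tags.append(word)
    return tags


def toy_reward(current_tags, candidate):
    """Placeholder for a learned reward: here, simply the candidate's length."""
    return len(candidate)


def expand(tags, candidates, reward, k=1):
    """Greedily append the k unseen candidate tags with the highest reward."""
    unseen = [t for t in candidates if t not in tags]
    ranked = sorted(unseen, key=lambda t: reward(tags, t), reverse=True)
    return tags + ranked[:k]


def reconstruct(base_instruction, added_tags):
    """Toy reconstruction: append the new tags to the original instruction."""
    if not added_tags:
        return base_instruction
    return f"{base_instruction.rstrip('.')}. Additionally, address: {', '.join(added_tags)}."


base = "Write a function to sort a list."
tags = compress(base)
expanded = expand(tags, CANDIDATE_TAGS, toy_reward, k=2)
harder = reconstruct(base, expanded[len(tags):])
print(tags)
print(expanded)
print(harder)
```

The parameter `k` is the knob that makes complexity controllable in this sketch: each added tag contributes one extra requirement to the reconstructed instruction, which is easier to bound than free-form rewriting of the raw text.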

He Zhu, Zhiwen Ruan, Junyou Su, Xingwei He, Yun Chen, Wenjia Zhang, Guanhua Chen • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Instruction Following	MT-Bench	MT-Bench Score	6.15	287
Instruction Following	AlpacaEval 2.0 (test)	LC Win Rate (%)	19.5	95
General Instruction Following	Arena Hard	Score	22.1	46
