TAG-INSTRUCT: Controlled Instruction Complexity Enhancement through Structure-based Augmentation
About
High-quality instruction data is crucial for developing large language models (LLMs), yet existing approaches struggle to effectively control instruction complexity. We present TAG-INSTRUCT, a novel framework that enhances instruction complexity through structured semantic compression and controlled difficulty augmentation. Unlike previous prompt-based methods operating on raw text, TAG-INSTRUCT compresses instructions into a compact tag space and systematically enhances complexity through RL-guided tag expansion. Through extensive experiments, we show that TAG-INSTRUCT outperforms existing instruction complexity augmentation approaches. Our analysis reveals that operating in tag space provides superior controllability and stability across different instruction synthesis frameworks.
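The compress-then-expand idea in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the authors' implementation: the tag names, the keyword-based `compress` function, and the hand-written `toy_reward` are hypothetical. A greedy reward-maximizing selection stands in for the RL-guided expansion policy. The final step, decoding the expanded tags back into a natural-language instruction, would need an LLM and is omitted.

```python
from typing import Callable, List, Set

# Hypothetical candidate tags; the actual tag vocabulary is not given in the abstract.
CANDIDATE_TAGS: List[str] = [
    "constraint:word-limit",
    "format:table",
    "reasoning:multi-step",
]


def compress(instruction: str) -> Set[str]:
    """Toy stand-in for semantic compression: map known keywords to tags."""
    keyword_to_tag = {
        "summarize": "task:summarization",
        "article": "domain:news",
    }
    words = instruction.lower().replace(".", "").split()
    return {keyword_to_tag[w] for w in words if w in keyword_to_tag}


def toy_reward(tags: Set[str], candidate: str) -> float:
    """Hypothetical reward: fixed score per tag, zero if already present.

    A learned reward model or policy would replace this in a real system.
    """
    scores = {
        "constraint:word-limit": 0.4,
        "format:table": 0.2,
        "reasoning:multi-step": 0.9,
    }
    return 0.0 if candidate in tags else scores[candidate]


def expand(
    tags: Set[str],
    reward: Callable[[Set[str], str], float],
    steps: int = 1,
) -> Set[str]:
    """Greedy stand-in for RL-guided expansion: add the best-scoring tag per step."""
    expanded = set(tags)
    for _ in range(steps):
        best = max(CANDIDATE_TAGS, key=lambda c: reward(expanded, c))
        if reward(expanded, best) <= 0.0:
            break  # no remaining candidate adds value
        expanded.add(best)
    return expanded


base_tags = compress("Summarize the article.")
harder_tags = expand(base_tags, toy_reward, steps=2)
```

Because complexity is added one tag at a time, the number of expansion steps acts as an explicit difficulty knob, which is one way to read the abstract's claim about controllability in tag space.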
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Instruction Following | MT-Bench | MT-Bench Score | 6.15 | 189 |
| Instruction Following | AlpacaEval 2.0 (test) | LC Win Rate (%) | 19.5 | 71 |
| General Instruction Following | Arena Hard | Score | 22.1 | 35 |