Difficulty-Controllable Cloze Question Distractor Generation
About
Multiple-choice cloze questions are commonly used to assess linguistic proficiency and comprehension. However, generating high-quality distractors remains challenging, as existing methods often lack adaptability and control over difficulty levels, and the absence of difficulty-annotated datasets further hinders progress. To address these issues, we propose a novel framework for generating distractors with controllable difficulty by leveraging both data augmentation and a multitask learning strategy. First, to create a high-quality, difficulty-annotated dataset, we introduce a two-way distractor generation process to produce diverse and plausible distractors. These candidates are filtered and then categorized by difficulty using an ensemble QA system. Second, this newly created dataset is used to train a difficulty-controllable generation model via multitask learning. Experimental results demonstrate that our method generates high-quality distractors across difficulty levels and substantially outperforms GPT-4o in aligning distractor difficulty with human perception.
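The abstract does not say how the ensemble QA system assigns difficulty. As a minimal sketch, assume each QA model picks between the correct answer and one candidate distractor, and that a distractor counts as hard when it fools at least a chosen fraction of the models. The names `label_difficulty`, `QAModel`, and `hard_threshold`, and the two-level easy/hard scheme, are hypothetical illustrations and not the authors' implementation.

```python
from typing import Callable, Sequence

# Assumption: a QA model is any callable that reads a cloze sentence and a
# list of options, and returns the option it believes fills the blank.
QAModel = Callable[[str, Sequence[str]], str]


def label_difficulty(
    sentence: str,
    answer: str,
    distractor: str,
    qa_models: Sequence[QAModel],
    hard_threshold: float = 0.5,  # hypothetical cut-off, not from the paper
) -> str:
    """Label a distractor 'hard' or 'easy' by how many QA models it fools."""
    if not qa_models:
        raise ValueError("at least one QA model is required")

    options = [answer, distractor]
    # Count the models that choose the distractor over the correct answer.
    fooled = sum(1 for model in qa_models if model(sentence, options) == distractor)
    fooled_ratio = fooled / len(qa_models)

    return "hard" if fooled_ratio >= hard_threshold else "easy"
```

In practice each callable would wrap a trained cloze-answering model. The threshold could also be replaced by more than two difficulty bins.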
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Distractor Generation | Cloth | Hardest Accuracy | 73.25 | 18 |
| Distractor Generation | Cloth (test) | Invalid Ratio (Easy) | 0.1 | 6 |
| Distractor Generation | CLOTH original (test) | F1@10 | 13.23 | 6 |
| Distractor Generation | CLOTH Easy | Invalid Ratio | 0 | 5 |
| Distractor Generation | CLOTH Hard | Invalid Ratio | 4.2 | 5 |
| Distractor Generation | CLOTH Easy augmented (test) | F1@10 | 26.64 | 4 |
| Distractor Generation | CLOTH Hard Augmented (test) | F1@10 | 41.98 | 4 |
| Multiple Choice Question Distractor Generation | Cloth (test) | Chosen Ratio | 9.4 | 4 |