Domain-Specific Specialization in Low-Resource Settings: The Efficacy of Offline Response-Based Knowledge Distillation in Large Language Models
About
Large Language Models (LLMs) excel at general tasks but often hallucinate when handling domain-specific or institutional knowledge absent from their pre-training. We present an offline, response-based knowledge distillation method that produces high-accuracy specialized assistants under constrained hardware resources. We evaluate three distinct data strategies: general domain adaptation (15,000 lines), unstructured knowledge injection (2,000 lines), and a context-aware synthetic dataset (500 lines) generated by a teacher model. To minimize computational costs, we use the Unsloth library to optimize the Qwen-2.5-7B student model, reducing NVIDIA A100 GPU memory requirements from 40 GB to 16 GB. Experimental results demonstrate that while the larger unstructured datasets suffer from persistent hallucinations, the 500-line context-aware dataset achieves 96.7% accuracy and robust rejection capability. These findings support the LIMA hypothesis, showing that data quality and structural alignment matter more than quantity for domain adaptation in low-resource settings.
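The offline, response-based distillation step can be sketched as follows: the teacher model's answers to domain questions are cached ahead of time and paired with the relevant regulation context to form instruction-tuning records for the student. This is a minimal illustrative sketch, not the paper's actual code; the record schema, helper name, and example texts are assumptions.

```python
import json

def build_distillation_records(qa_pairs, context):
    """Wrap cached teacher (question, answer) pairs with the relevant
    regulation context so the student learns context-grounded responses.
    Schema is illustrative (Alpaca-style instruction/input/output)."""
    records = []
    for question, teacher_answer in qa_pairs:
        records.append({
            "instruction": question,      # domain question posed to the teacher
            "input": context,             # context-aware: regulation excerpt
            "output": teacher_answer,     # cached teacher response = target
        })
    return records

# Hypothetical cached teacher responses about an illustrative regulation.
pairs = [
    ("What is the minimum passing grade?",
     "According to the regulation, the minimum passing grade is CC."),
    ("Who approves make-up exams?",
     "The relevant faculty administrative board approves make-up exams."),
]
records = build_distillation_records(pairs, "Article 24: Grading ...")

# Serialize to JSONL for supervised fine-tuning of the student model.
jsonl = "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
print(len(records))
```

Each JSONL record then serves as one supervised example for fine-tuning the Qwen-2.5-7B student; keeping the context in the `input` field is what distinguishes the 500-line context-aware dataset from the unstructured knowledge-injection baselines.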
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| General Knowledge | MMLU | Accuracy | 74.2 | 170 |
| Mathematical Problem Solving | GSM8K | Pass@1 | 91.6 | 8 |
| Reasoning | MATH | Success Score | 75.5 | 3 |
| Question Answering | Düzce University Undergraduate Legislation Regulation (test 1) | Success Rate | 0.9 | 1 |
| Question Answering | Düzce University Undergraduate Legislation General (test 2) | Success Rate | 96.7 | 1 |
| Question Answering | Düzce University Undergraduate Legislation Challenging (test 3) | Success Rate | 66.7 | 1 |