Rethinking Local Learning: A Cheaper and Faster Recipe for LLM Post-Training
About
LLM post-training typically propagates task gradients through the full depth of the model. Although this end-to-end structure is simple and general, it couples task adaptation to full-depth activation storage, long-range backward dependencies, and direct task-gradient access to pretrained representations. We argue that this full-depth backward coupling can be unnecessarily expensive and intrusive, particularly when post-training supervision is much narrower than pre-training. To this end, we propose LoPT (Local-Learning Post-Training), a simple post-training strategy that makes gradient reach an explicit design choice. LoPT places a single gradient boundary at the transformer midpoint: the second-half block learns from the task objective, while the first-half block is updated by a lightweight feature-reconstruction objective to preserve useful representations and maintain interface compatibility. LoPT shortens the task-induced backward path while limiting direct interference from narrow task gradients on early-layer representations. Extensive experiments demonstrate that LoPT achieves competitive performance with lower memory cost, higher training efficiency, and better retention of pretrained capabilities. Our code is available at: https://github.com/HumyuShi/LoPT
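The gradient boundary described above can be illustrated with a toy scalar model. This is a minimal sketch under stated assumptions, not the authors' implementation: `ToyParams`, `lopt_step`, and `end_to_end_grad_w1` are hypothetical names, each "half" of the model is collapsed to a single weight, and the reconstruction objective is assumed to be a small decoder `d` that rebuilds the input from the midpoint feature (the abstract does not specify the exact form). Gradients are written out by hand so the example needs only the standard library; in an autodiff framework the boundary would typically be a stop-gradient such as `torch.Tensor.detach` or `jax.lax.stop_gradient`.

```python
from dataclasses import dataclass


@dataclass
class ToyParams:
    w1: float  # stand-in for the first-half block (hypothetical scalar)
    w2: float  # stand-in for the second-half block (hypothetical scalar)
    d: float   # lightweight decoder used only by the local objective (assumed)


def lopt_step(p: ToyParams, x: float, target: float, lr: float = 0.1) -> ToyParams:
    """One SGD step with a single gradient boundary at the midpoint feature h."""
    # Forward pass through the first half.
    h = p.w1 * x

    # Boundary: the second half treats h as a constant input, so the task
    # loss (y - target)^2 produces a gradient for w2 only.
    y = p.w2 * h
    task_err = y - target
    grad_w2 = 2.0 * task_err * h

    # Local objective for the first half: reconstruct x from h through d.
    # The loss (d * h - x)^2 produces gradients for w1 and d only.
    x_hat = p.d * h
    rec_err = x_hat - x
    grad_w1 = 2.0 * rec_err * p.d * x
    grad_d = 2.0 * rec_err * h

    return ToyParams(
        w1=p.w1 - lr * grad_w1,
        w2=p.w2 - lr * grad_w2,
        d=p.d - lr * grad_d,
    )


def end_to_end_grad_w1(p: ToyParams, x: float, target: float) -> float:
    """For contrast: the task gradient w1 would receive without the boundary."""
    return 2.0 * (p.w2 * p.w1 * x - target) * p.w2 * x


# Usage: two steps from the same parameters with different task targets.
p0 = ToyParams(w1=1.0, w2=0.5, d=0.8)
a = lopt_step(p0, x=2.0, target=3.0)
b = lopt_step(p0, x=2.0, target=-3.0)
# The first-half weight and decoder do not depend on the task target,
# while the second-half weight does.
print(a.w1 == b.w1, a.d == b.d, a.w2 != b.w2)
```

In this toy setting the first-half update is identical for both targets, which is the sense in which the boundary shields early layers from task gradients. It also means the backward pass for the task loss never has to traverse the first half, so a real model would not need to keep first-half activations for the task gradient.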
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Instruction Following | IFEval | -- | -- | 836 |
| Commonsense Reasoning | HellaSwag | HellaSwag Accuracy | 86.24 | 711 |
| Mathematical Reasoning | GSM8K | -- | -- | 204 |
| Massive Multitask Language Understanding | MMLU | Accuracy | 83.34 | 129 |
| Large Language Model Evaluation | HuggingFace Open LLM Leaderboard lm-eval-harness default (various) | HellaSwag | 80.97 | 36 |
| Commonsense Reasoning | WinoGrande | Accuracy | 80.71 | 24 |
| Truthfulness Evaluation | TruthfulQA | Accuracy | 65.75 | 20 |
| Language Model Evaluation | lm-eval-harness (test) | MMLU | 74.18 | 9 |
| Reasoning Question Answering | ARC Challenge | Accuracy | 74.55 | 3 |