
ChunkFT: Byte-Streamed Optimization for Memory-Efficient Full Fine-Tuning

About

This work presents ChunkFT, a memory-efficient fine-tuning framework that reformulates full-parameter fine-tuning around a dynamically activated working set. ChunkFT enables gradient computation for arbitrary sub-tensors without modifying the network architecture, providing an algorithmic foundation for optimizing arbitrary sub-networks while avoiding standard dense gradient computation. We provide a theoretical convergence analysis of ChunkFT in the deterministic setting. Empirically, we apply ChunkFT to fine-tune Llama 3-8B on a single RTX 4090-24GB GPU and Llama 3-70B on 2× H800-80GB GPUs. Full-parameter fine-tuning of a 7B model with a 1K input length requires only 13.72GB of GPU memory. The results demonstrate the effectiveness of ChunkFT in memory usage, running time, and optimization quality. Moreover, downstream evaluations on language understanding, mathematical reasoning, and MT-Bench show that ChunkFT consistently outperforms existing memory-efficient baselines. Notably, ChunkFT achieves performance comparable to, and in some cases exceeding, full-parameter fine-tuning. The code is available at https://github.com/misonsky/chunk.
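The working-set idea can be illustrated on a toy problem. The following is a minimal sketch, not the ChunkFT algorithm itself: it assumes a small least-squares objective, a cyclic schedule for choosing the active chunk, and plain gradient descent, none of which are specified in the abstract. The function name `chunked_descent` and its parameters are hypothetical. What it shows is that every step runs a full forward pass but computes and applies gradients only for the currently active slice of parameters, so a dense gradient is never materialized.

```python
def chunked_descent(A, b, chunk_size, steps, lr):
    """Minimize 0.5 * ||A w - b||^2, updating one chunk of w per step.

    Illustrative only: the cyclic chunk schedule and the plain
    gradient step are assumptions, not the paper's method.
    """
    n = len(A[0])
    w = [0.0] * n
    for step in range(steps):
        # Pick the active working set (a contiguous slice of w), cyclically.
        start = (step * chunk_size) % n
        active = range(start, min(start + chunk_size, n))
        # Forward pass: the residual depends on all parameters.
        r = [sum(a * x for a, x in zip(row, w)) - y for row, y in zip(A, b)]
        # Backward pass: gradient entries are computed for the active slice only.
        for j in active:
            g = sum(row[j] * ri for row, ri in zip(A, r))
            w[j] -= lr * g
    return w


# Toy usage: a diagonal system whose exact solution is [1, 1, 3, 2].
A = [[2.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 2.0]]
b = [2.0, 1.0, 3.0, 4.0]
w = chunked_descent(A, b, chunk_size=2, steps=200, lr=0.2)
```

In this toy setting the per-step gradient storage is proportional to `chunk_size` rather than to the number of parameters, which is the kind of saving the abstract attributes to avoiding dense gradient computation.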

Yongkang Liu, Zijing Wang, Mengjie Zhao, Ercong Nie, Mingyang Wang, Qian Li, Feiliang Ren, Shi Feng, Daling Wang, Hinrich Schütze • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Instruction Following | MT-Bench | MT-Bench Score | 6.8 | 287 |
| Mathematical Reasoning | AQUA | Accuracy | 45.6 | 167 |
| Natural Language Understanding | SuperGLUE (test) | BoolQ Accuracy | 85.7 | 74 |
| Mathematical Reasoning | NUMGLUE | Accuracy | 56.4 | 39 |
| Mathematical Reasoning | MMLU Math | Score | 51.4 | 9 |
| Mathematical Reasoning | SAT Math | SAT Math Score | 57.8 | 9 |
| Natural Language Understanding | SuperGLUE | BoolQ Accuracy | 88.5 | 6 |
| Mathematical Reasoning | Math Benchmarks evaluated on Llama 3-70B | GSM8K Accuracy | 77.9 | 5 |
