Batch Loss Score for Dynamic Data Pruning

About

Dynamic data pruning accelerates deep learning by selectively omitting less informative samples during training. While per-sample loss is a common importance metric, obtaining it can be challenging or infeasible for complex models or loss functions, often requiring significant implementation effort. This work proposes the Batch Loss Score (BLS), a computationally efficient alternative using an Exponential Moving Average (EMA) of readily available batch losses to assign scores to individual samples. We frame the batch loss, from the perspective of a single sample, as a noisy measurement of its scaled individual loss, with noise originating from stochastic batch composition. It is formally shown that the EMA mechanism functions as a first-order low-pass filter, attenuating high-frequency batch composition noise. This yields a score approximating the smoothed and persistent contribution of the individual sample to the loss, providing a theoretical grounding for BLS as a proxy for sample importance. BLS demonstrates remarkable code integration simplicity (three-line injection) and readily adapts existing per-sample loss-based methods (one-line proxy). Its effectiveness is demonstrated by enhancing two such methods to losslessly prune 20%-50% of samples across 14 datasets, 11 tasks and 18 models, highlighting its utility and broad applicability, especially for complex scenarios where per-sample loss is difficult to access. Code is available at https://github.com/mrazhou/BLS.
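
The scoring idea described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the code from the linked repository: the names `update_bls`, `simulate` and `pearson`, the default `beta`, and the synthetic per-sample losses are all hypothetical choices made for illustration. Each sample in a batch receives the same batch loss as a noisy observation, and an EMA smooths those observations over epochs.

```python
import random


def update_bls(scores, batch_indices, batch_loss, beta=0.9):
    """EMA update: every sample in the batch is credited with the batch loss.

    `beta` is an assumed smoothing factor, not a value taken from the paper.
    """
    for i in batch_indices:
        scores[i] = beta * scores[i] + (1.0 - beta) * batch_loss


def simulate(num_samples=200, batch_size=4, epochs=200, beta=0.95, seed=0):
    """Toy run where the scorer only ever sees mean batch losses."""
    rng = random.Random(seed)
    # Synthetic, fixed per-sample losses; hidden from the scorer.
    true_loss = [rng.random() for _ in range(num_samples)]
    scores = [0.0] * num_samples
    order = list(range(num_samples))
    for _ in range(epochs):
        rng.shuffle(order)  # stochastic batch composition is the noise source
        for start in range(0, num_samples, batch_size):
            idx = order[start:start + batch_size]
            batch_loss = sum(true_loss[i] for i in idx) / len(idx)
            update_bls(scores, idx, batch_loss, beta)
    return true_loss, scores


def pearson(x, y):
    """Plain Pearson correlation, used to compare scores with hidden losses."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


if __name__ == "__main__":
    true_loss, scores = simulate()
    print("correlation with hidden per-sample loss:", pearson(true_loss, scores))
```

In this toy setting the per-sample losses are held constant, so the EMA mainly averages out the contribution of the other samples in each batch. In real training the losses drift over time, and the choice of `beta` trades responsiveness against noise suppression.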

Qing Zhou, Bingxuan Zhao, Tao Yang, Hongyuan Zhang, Junyu Gao, Qi Wang • 2026

Related benchmarks

Task	Dataset	Result
Object Detection	COCO 2017 (val)	AP 22.2	2843
Instance Segmentation	COCO 2017 (val)	--	1275
Image Classification	CIFAR100	Accuracy 78.5	301
Image Classification	CIFAR10	Accuracy 95.6	143
Image Captioning	NoCaps 1.0 (val)	Overall Score 65.3	32
Image Classification	ImageNet-1K	Accuracy 80	18
Image Classification	CIFAR100 (test)	Top-1 Accuracy 58	10
Image Classification	CIFAR100	Accuracy 80.7	6
Image Captioning	COCO	BLEU@4 27.2	3
Multi-view Stereo	WHU-MVS	Accuracy (<3 units) 95.17	3
