Learning to Perturb Hidden Representations for Generalizable Deep Learning

About

Deep neural networks process data through a cascade of representations: input features, hidden activations, logits, and loss. Perturbations at the input, logit, and label levels have been studied systematically, but the intermediate hidden activations, which constitute the bulk of the network's computation, have received no unified perturbation analysis. In this paper, we establish a unified framework for hidden activation perturbation and show that Dropout, Manifold Mixup, adversarial feature perturbation, and related methods all impose specific forms of activation perturbation, but rely on class-agnostic or random strategies. We conjecture that expansive perturbation (increasing the activation norm) acts as positive augmentation, while contractive perturbation (decreasing the activation norm) acts as negative augmentation, and that the choice of perturbation layer determines whether the effect resembles input-level augmentation (shallow layers) or logit-level manipulation (deep layers). We propose Learning to Perturb Activations (LPA), which adaptively perturbs activations at a selected hidden layer using class-level perturbations learned via PGD. We further provide theoretical analysis connecting activation perturbation to flat minima and to the amplification of perturbations through layers. Experiments on balanced classification, long-tail classification, and domain generalization demonstrate that LPA consistently outperforms existing methods and provides benefits complementary to logit perturbation methods such as LPL.
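
The abstract does not spell out the LPA objective, so the following is only a minimal sketch of what a class-level perturbation learned via PGD could look like, not the authors' implementation. It assumes a single perturbation vector `delta` shared by all samples of one class, an L2 ball of radius `eps` as the constraint, and a linear softmax head `W` on top of the perturbed hidden layer. All names (`class_level_pgd`, `project_l2`, `eps`, `alpha`, `maximize`) are hypothetical and chosen for illustration.

```python
import math

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def project_l2(delta, eps):
    """Project delta onto the L2 ball of radius eps (assumed constraint)."""
    n = l2_norm(delta)
    if n <= eps or n == 0.0:
        return list(delta)
    return [x * eps / n for x in delta]

def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def loss_and_grad_h(h, W, y):
    """Cross-entropy of a linear head W (classes x dim) on activation h,
    together with the gradient of the loss with respect to h."""
    logits = [sum(w * a for w, a in zip(row, h)) for row in W]
    p = softmax(logits)
    loss = -math.log(p[y])
    # dL/dlogits = p - onehot(y); dL/dh = W^T (p - onehot(y))
    g = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
    grad = [sum(g[k] * W[k][j] for k in range(len(W))) for j in range(len(h))]
    return loss, grad

def mean_loss(hs, W, y, delta):
    """Average loss over activations hs of class y after adding delta."""
    total = sum(loss_and_grad_h([a + d for a, d in zip(h, delta)], W, y)[0]
                for h in hs)
    return total / len(hs)

def class_level_pgd(hs, W, y, eps, alpha, steps, maximize):
    """Learn one perturbation shared by all activations hs of class y.
    maximize=True ascends the mean loss, maximize=False descends it."""
    dim = len(hs[0])
    delta = [0.0] * dim
    sign = 1.0 if maximize else -1.0
    for _ in range(steps):
        avg = [0.0] * dim
        for h in hs:
            _, g = loss_and_grad_h([a + d for a, d in zip(h, delta)], W, y)
            avg = [s + gi / len(hs) for s, gi in zip(avg, g)]
        gn = l2_norm(avg)
        if gn == 0.0:
            break
        # Normalized gradient step, then projection back into the ball.
        stepped = [d + sign * alpha * gi / gn for d, gi in zip(delta, avg)]
        delta = project_l2(stepped, eps)
    return delta

if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]        # toy 2-class head (illustrative)
    hs = [[1.0, 0.2], [0.8, -0.1]]      # toy activations of class 0
    for maximize in (True, False):
        delta = class_level_pgd(hs, W, 0, eps=0.5, alpha=0.1, steps=10,
                                maximize=maximize)
        print(maximize, delta, mean_loss(hs, W, 0, delta))
```

In this sketch the `maximize` flag selects whether the shared perturbation raises or lowers the class's mean loss. How LPA chooses that direction per class, and how it relates to the expansive and contractive cases described above, is not specified in the abstract and is left open here.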

Hua Li • 2026

Related benchmarks

Task	Dataset	Metric	Result	
Domain Generalization	DomainBed (test)	VLCS Accuracy	79.1	118
Image Classification	CIFAR-100-LT-100 balanced (test)	Top-1 Error Rate	16.48	22
Image Classification	CIFAR-10 balanced setting (test)	Top-1 Error Rate	3.06	22
