MoASE++: Mixture of Activation Sparsity Experts with Domain-Adaptive On-policy Distillation for Continual Test Time Adaptation

About

Continual test-time adaptation adapts a source-pretrained model to non-stationary, unlabeled target streams while retaining past competence, yet texture-biased backbones risk error accumulation and catastrophic forgetting. Drawing inspiration from the process of decoupling shape and texture in the human visual system, we introduce MoASE, a plug-in mixture-of-experts that disentangles domain-agnostic structure from domain-specific texture using Activation Sparsity Experts with Spatial Differentiable Dropout, forming complementary high- and low-activation pathways, while high- and low-rank bottlenecks diversify representations. The Activation Sparsity Gate produces input-adaptive SDD thresholds for precise token selection, and the Domain-Aware Router assigns per-sample expert weights using texture-sensitive cues. To curb confirmation bias on unlabeled streams and stabilize supervision, we then introduce Domain-Adaptive On-Policy Distillation to constitute MoASE++, with an EMA-anchored on-policy reverse KL distillation and an augmentation policy conditioned on entropy and confidence that aligns predictions across the same views and improves the robustness-plasticity balance. Extensive experiments on classification (CIFAR-10/100-C, ImageNet-C) and semantic segmentation (Cityscapes->ACDC) demonstrate consistent state-of-the-art performance, offering a principled, controllable approach to continual adaptation in dynamic visual environments.
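As a rough illustration of the distillation ingredients named in the abstract, the following minimal sketch computes a reverse KL divergence between student and teacher predictions and applies an exponential-moving-average (EMA) teacher update. It is not the authors' implementation: the function names, the momentum value, and the reading of "reverse KL" as KL(student || teacher) with the expectation taken under the student's own distribution are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reverse_kl(student_logits, teacher_logits, eps=1e-12):
    """KL(student || teacher): the expectation is taken under the student's
    distribution (assumed interpretation of 'on-policy reverse KL')."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    return sum(s * (math.log(s + eps) - math.log(t + eps))
               for s, t in zip(p_s, p_t))

def ema_update(teacher_params, student_params, momentum=0.999):
    """Move each teacher parameter a small step toward the student.
    The momentum value is a hypothetical default, not taken from the paper."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]

# Toy usage: the loss is zero when the predictions agree, positive otherwise.
loss_same = reverse_kl([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_diff = reverse_kl([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
new_teacher = ema_update([1.0, 1.0], [0.0, 2.0], momentum=0.9)
```

Under this reading, the student is penalized mainly where it places probability mass that the slowly moving teacher does not support, which is one plausible way such a loss could limit confirmation bias on an unlabeled stream.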

Rongyu Zhang, Aosong Cheng, Gaole Dai, Yulin Luo, Jiaming Liu, Li Du, Huanrui Yang, Dan Wang, Leyuan Fang, Yuan Du, Shanghang Zhang • 2026

Related benchmarks

Task	Dataset	Metric	Result
Image Classification	CIFAR-100-C	Accuracy (Corruption)	25.8	137
Image Classification	ImageNet-C	Accuracy (Brightness)	27.1	54
Image Classification	CIFAR-10-C	Mean Accuracy (mAcc)	16.8	52
Semantic Segmentation	ACDC	mIoU	62.3	34
Semantic Segmentation	ACDC Round 3	mIoU (Fog)	73.5	19
Semantic Segmentation	ACDC Round 2	mIoU (Fog)	73.4	19
Domain Generalization	ImageNet-C (10 unseen domains)	Accuracy (Motion)	43.9	6

