MoASE++: Mixture of Activation Sparsity Experts with Domain-Adaptive On-policy Distillation for Continual Test Time Adaptation
About
Continual test-time adaptation adapts a source-pretrained model to non-stationary, unlabeled target streams while retaining past competence, yet texture-biased backbones risk error accumulation and catastrophic forgetting. Drawing inspiration from how the human visual system decouples shape and texture, we introduce MoASE, a plug-in mixture-of-experts that disentangles domain-agnostic structure from domain-specific texture. Its Activation Sparsity Experts use Spatial Differentiable Dropout (SDD) to form complementary high- and low-activation pathways, while high- and low-rank bottlenecks diversify representations. The Activation Sparsity Gate produces input-adaptive SDD thresholds for precise token selection, and the Domain-Aware Router assigns per-sample expert weights using texture-sensitive cues. To curb confirmation bias on unlabeled streams and stabilize supervision, we then add Domain-Adaptive On-Policy Distillation, which yields MoASE++: an EMA-anchored on-policy reverse-KL distillation combined with an augmentation policy conditioned on entropy and confidence, which aligns predictions across views of the same input and improves the robustness-plasticity balance. Extensive experiments on classification (CIFAR-10/100-C, ImageNet-C) and semantic segmentation (Cityscapes-to-ACDC) demonstrate consistent state-of-the-art performance, offering a principled, controllable approach to continual adaptation in dynamic visual environments.
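The abstract names an EMA-anchored reverse KL distillation but gives no formulas, so the following is only a minimal sketch of what those two ingredients usually mean, not the authors' implementation. It assumes a classification setting where the student and an EMA teacher each emit logits, takes "reverse KL" to be KL(student || teacher) with the expectation under the student's own predictions, and uses hypothetical helper names (`softmax`, `reverse_kl`, `ema_update`) and a hypothetical momentum value.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reverse_kl(student_logits, teacher_logits, eps=1e-12):
    """KL(student || teacher): the expectation is taken under the student's
    own distribution (assumed reading of 'on-policy reverse KL')."""
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    return sum(s * (math.log(s + eps) - math.log(t + eps))
               for s, t in zip(p_s, p_t))

def ema_update(teacher_params, student_params, momentum=0.999):
    """Exponential moving average of student weights into the teacher.
    The momentum value is a hypothetical default, not taken from the paper."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]

# Toy usage: one distillation loss value and one EMA step on flat parameter lists.
loss = reverse_kl([2.0, 0.5, -1.0], [1.5, 0.7, -0.8])
new_teacher = ema_update([1.0, 1.0], [0.0, 2.0], momentum=0.9)
```

Compared with forward KL, the reverse direction penalizes the student most where it places probability mass that the teacher does not, which is one common motivation for using it against a slowly moving EMA anchor.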
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-100-C | Accuracy (Corruption) | 25.8 | 109 |
| Image Classification | ImageNet-C | Accuracy (Brightness) | 27.1 | 54 |
| Image Classification | CIFAR10-C | Mean Accuracy (mAcc) | 16.8 | 41 |
| Semantic Segmentation | ACDC | mIoU | 62.3 | 34 |
| Semantic Segmentation | ACDC Round 3 | mIoU (Fog) | 73.5 | 19 |
| Semantic Segmentation | ACDC Round 2 | mIoU (Fog) | 73.4 | 19 |
| Domain Generalization | ImageNet-C (10 unseen domains) | Accuracy (Motion) | 43.9 | 6 |