Compensating Visual Insufficiency with Stratified Language Guidance for Long-Tail Class Incremental Learning
About
Long-tail class-incremental learning (LT-CIL) remains highly challenging because the scarcity of samples in tail classes not only hampers their learning but also exacerbates catastrophic forgetting under continuously evolving and imbalanced data distributions. To tackle these issues, we exploit the informativeness and scalability of language knowledge. Specifically, we analyze the LT-CIL data distribution to guide large language models (LLMs) in generating a stratified language tree that hierarchically organizes semantic information from coarse-grained to fine-grained levels. Building upon this structure, we introduce stratified adaptive language guidance, which leverages learnable weights to merge multi-scale semantic representations, thereby enabling dynamic supervisory adjustment for tail classes and alleviating the impact of data imbalance. Furthermore, we introduce stratified alignment language guidance, which exploits the structural stability of the language tree to constrain optimization and reinforce semantic-visual alignment, thereby alleviating catastrophic forgetting. Extensive experiments on multiple benchmarks demonstrate that our method achieves state-of-the-art performance.
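The idea of merging multi-scale semantic representations with learnable weights can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the softmax normalization, the three-level tree, the toy two-dimensional embeddings, and the names `softmax` and `fuse_levels` are all assumptions made for illustration. In practice the logits would be trained by gradient descent and the embeddings would come from a text encoder.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_levels(level_embeddings, level_logits):
    """Merge per-level text embeddings of one class into a single vector.

    level_embeddings: one embedding (list of floats) per tree level,
                      ordered coarse to fine (hypothetical layout).
    level_logits:     one scalar per level; stands in for the learnable
                      weights (assumed softmax-normalized here).
    """
    if len(level_embeddings) != len(level_logits):
        raise ValueError("need exactly one logit per tree level")
    weights = softmax(level_logits)
    dim = len(level_embeddings[0])
    fused = [0.0] * dim
    for w, emb in zip(weights, level_embeddings):
        for i, v in enumerate(emb):
            fused[i] += w * v
    return fused, weights


# Toy example: three levels with equal logits give uniform weights.
coarse, mid, fine = [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
fused, weights = fuse_levels([coarse, mid, fine], [0.0, 0.0, 0.0])
```

Under this assumption, raising the logit of a coarse level would pull the fused target toward more general semantics, which is one plausible way supervision could be adjusted for a tail class with few images.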
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Class-incremental learning | CIFAR100 10 Tasks | Accuracy | 81.2 | 36 |
| Class-incremental learning | CIFAR100 rho=0.1 (test) | A_last Accuracy | 77.3 | 28 |
| Class-incremental learning | CIFAR100 rho=0.01 (test) | A_last | 64.3 | 28 |
| Class-incremental learning | CUB200 (test) | A_last | 51.5 | 20 |
| Class-incremental learning | ImageNetR-LT 10-task setting | A_last | 76.1 | 14 |
| Long-Tail Class-Incremental Learning | ImageNet-R 20 tasks rho=0.1 | Accuracy (Last Task) | 73.7 | 14 |
| Long-Tail Class-Incremental Learning | ImageNet-R 10 tasks rho=0.01 | Accuracy (Last Task) | 72.1 | 14 |
| Long-Tail Class-Incremental Learning | ImageNet-R 20 tasks rho=0.01 | Accuracy (Last Task) | 70 | 14 |
| Long-Tail Class-Incremental Learning | ImageNet-LT 20 tasks 1.0 | Accuracy | 68.2 | 4 |
| Long-Tail Class-Incremental Learning | ImageNet-LT 50 tasks 1.0 | Accuracy | 66.7 | 4 |
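For readers unfamiliar with the rho values in the dataset names, the sketch below shows one common way a long-tailed split is built. It assumes that rho is the ratio of the smallest to the largest class size and that class sizes decay exponentially between the two; this page confirms neither, and the function name `long_tail_counts` and the example sizes are hypothetical.

```python
def long_tail_counts(n_max, num_classes, rho):
    """Per-class sample counts under an assumed exponential decay profile.

    n_max:       size of the largest (head) class.
    num_classes: number of classes, at least 2.
    rho:         assumed ratio n_min / n_max, with 0 < rho <= 1.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")
    return [
        max(1, round(n_max * rho ** (i / (num_classes - 1))))
        for i in range(num_classes)
    ]


# Example: 100 classes, 500 images in the head class, rho = 0.01.
counts = long_tail_counts(n_max=500, num_classes=100, rho=0.01)
```

Under these assumptions a smaller rho means a steeper tail, so the rho=0.01 rows describe a harder imbalance setting than the rho=0.1 rows.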