Re-evaluating Continual Learning Scenarios: A Categorization and Case for Strong Baselines
About
Continual learning has received a great deal of attention recently, with several approaches being proposed. However, evaluations involve a diverse set of scenarios, making meaningful comparison difficult. This work provides a systematic categorization of those scenarios and evaluates them within a consistent framework that includes both strong baselines and state-of-the-art methods. The results clarify the relative difficulty of the scenarios and show, surprisingly, that simple baselines (Adagrad, L2 regularization, and naive rehearsal strategies) can achieve performance similar to current mainstream methods. We conclude with several suggestions for creating harder evaluation scenarios and for future research directions. The code is available at https://github.com/GT-RIPL/Continual-Learning-Benchmark
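To make the naive rehearsal baseline concrete, here is a minimal sketch of the idea: keep a small fixed-size memory of past examples and mix a random subset of them into each new-task mini-batch. The class name, method names, and the random-replacement eviction policy below are illustrative assumptions, not the repository's actual API.

```python
import random

class RehearsalBuffer:
    """Fixed-size memory of past examples for naive rehearsal.

    Hypothetical sketch (not the repository's exact implementation):
    examples are stored up to `capacity`, and a random subset is
    mixed into each mini-batch of the current task.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.memory = []
        self.rng = random.Random(seed)

    def store(self, examples):
        # Fill the buffer; once full, overwrite a random slot so the
        # memory keeps a mixture of examples from all tasks seen so far.
        for ex in examples:
            if len(self.memory) < self.capacity:
                self.memory.append(ex)
            else:
                self.memory[self.rng.randrange(self.capacity)] = ex

    def mixed_batch(self, new_batch, replay_ratio=0.5):
        # Append replayed old examples to the current-task batch.
        k = min(len(self.memory), int(len(new_batch) * replay_ratio))
        return list(new_batch) + self.rng.sample(self.memory, k)
```

Training then proceeds as usual, except every gradient step is taken on `mixed_batch(...)` instead of the raw new-task batch, which is what lets this simple strategy retain earlier tasks.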
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Keyword Spotting | Google Speech Commands (test) | Accuracy | 56 | 61 |
| Incremental Task Learning (ITL) | split-MNIST (test) | Retained Accuracy | 99.41 | 32 |
| Incremental Task Learning (ITL) | Permuted MNIST (test) | Retained Accuracy | 97.24 | 32 |
| Incremental Task Learning (ITL) | split-CIFAR100 (test) | Retained Accuracy | 78.41 | 24 |
| Incremental Domain Learning (IDL) | split-MNIST (test) | Retained Accuracy | 97.13 | 16 |
| Incremental Domain Learning (IDL) | Permuted MNIST (test) | Retained Accuracy | 96.75 | 16 |
| Incremental Domain Learning (IDL) | split-CIFAR100 (test) | Retained Accuracy | 51.81 | 12 |
| Environmental Sound Classification | DCASE Task 1 2019 (incremental split, 5 tasks) | Accuracy | 47.3 | 6 |
| Environmental Sound Classification | ESC-50 (incremental split, 5 tasks) | Accuracy | 22.5 | 6 |