Uncertainty-based Continual Learning with Adaptive Regularization

About

We introduce a new neural network-based continual learning algorithm, dubbed Uncertainty-regularized Continual Learning (UCL), which builds on the traditional Bayesian online learning framework with variational inference. We focus on two significant drawbacks of recently proposed regularization-based methods: a) the considerable additional memory cost of determining the per-weight regularization strengths and b) the absence of a graceful forgetting scheme, which can prevent performance degradation in learning new tasks. In this paper, we show that UCL can solve these two problems by introducing a fresh interpretation of the Kullback-Leibler (KL) divergence term of the variational lower bound for the Gaussian mean-field approximation. Based on this interpretation, we propose the notion of node-wise uncertainty, which drastically reduces the number of additional parameters needed to implement per-weight regularization. Moreover, we devise two additional regularization terms that enforce stability by freezing parameters important for past tasks and allow plasticity by controlling the actively learning parameters for a new task. Through extensive experiments, we show that UCL convincingly outperforms most recent state-of-the-art baselines, not only on popular supervised learning benchmarks but also on challenging lifelong reinforcement learning tasks. The source code of our algorithm is available at https://github.com/csm9493/UCL.
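To make the KL term mentioned above concrete, here is a minimal sketch of the standard closed-form KL divergence between two Gaussian mean-field distributions. It is not the authors' UCL loss and is not taken from their repository. The function name, the layer shape, and the choice to share one standard deviation per output node are assumptions made for illustration only. The point it shows is that the mean term is a squared distance scaled by 1/sigma_prev^2, so a small previous-task uncertainty behaves like a strong per-weight regularizer.

```python
import numpy as np


def gaussian_mean_field_kl(mu, sigma, mu_prev, sigma_prev):
    """KL( N(mu, sigma^2) || N(mu_prev, sigma_prev^2) ), summed over independent weights.

    Standard closed form; the function name and signature are hypothetical.
    """
    mu, sigma, mu_prev, sigma_prev = map(np.asarray, (mu, sigma, mu_prev, sigma_prev))
    var_ratio = (sigma / sigma_prev) ** 2
    # Mean term: squared change in the mean, scaled by 1 / sigma_prev^2.
    mean_term = ((mu - mu_prev) / sigma_prev) ** 2
    # Variance term: zero when sigma == sigma_prev, positive otherwise.
    var_term = var_ratio - 1.0 - np.log(var_ratio)
    return 0.5 * float(np.sum(mean_term + var_term))


# Hypothetical layer with 3 inputs and 2 output nodes.
mu_prev = np.zeros((3, 2))
mu = np.full((3, 2), 0.1)  # every weight mean moves by 0.1

# Assumption for illustration: one standard deviation per output node,
# broadcast to all of that node's incoming weights.
sigma_node_prev = np.array([0.1, 1.0])  # node 0 is "certain", node 1 is "uncertain"
sigma_prev = np.broadcast_to(sigma_node_prev, mu.shape)
sigma = sigma_prev  # variances unchanged, so only the mean term contributes

kl = gaussian_mean_field_kl(mu, sigma, mu_prev, sigma_prev)
kl_same = gaussian_mean_field_kl(mu_prev, sigma_prev, mu_prev, sigma_prev)
print(kl, kl_same)
```

In this toy setup the same shift of 0.1 costs 100 times more on the weights feeding the low-uncertainty node than on those feeding the high-uncertainty node, and storing one standard deviation per node needs 2 extra values instead of 6 for this layer.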

Hongjoon Ahn, Sungmin Cha, Donggyu Lee, Taesup Moon • 2019

Related benchmarks

Task	Dataset	Result
Text Classification	20News	Accuracy 94.65	143
Class-incremental learning	CIFAR-100 20 tasks	--	58
Task-Incremental Learning	Tiny-ImageNet 20 tasks	Average Accuracy 55.2	54
Task-Incremental Learning	CIFAR-100 10 tasks	Backward Transfer -7.2	44
Document Sentiment Classification	DSC small	Accuracy 80.12	40
Aspect Sentiment Classification	ASC	Accuracy 84.41	40
Document Sentiment Classification	DSC full	Accuracy 74.76	40
Forgetting Rate	20News FR	Accuracy 4.7	34
Continual Segmentation	Med JASCL Disjoint	Total Drop (%) 53.6	28
Semantic segmentation	Med JASCL-Disjoint Session 1: AMOS	Dice Score 43	28

Showing 10 of 34 rows
