Meta-Learning with Self-Improving Momentum Target

About

The idea of using a separately trained target model (or teacher) to improve the performance of a student model has become increasingly popular in various machine learning domains, and meta-learning is no exception; a recent discovery shows that utilizing task-wise target models can significantly boost generalization performance. However, obtaining a target model for each task can be highly expensive, especially when the number of tasks for meta-learning is large. To tackle this issue, we propose a simple yet effective method, coined Self-improving Momentum Target (SiMT). SiMT generates the target model by adapting from the temporal ensemble of the meta-learner, i.e., the momentum network. This momentum network and its task-specific adaptations enjoy favorable generalization performance, enabling the meta-learner to improve itself through knowledge distillation. Moreover, we found that perturbing the parameters of the meta-learner, e.g., with dropout, further stabilizes this self-improving process by preventing fast convergence of the distillation loss during meta-training. Our experimental results demonstrate that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods under various applications, including few-shot regression, few-shot classification, and meta-reinforcement learning. Code is available at https://github.com/jihoontack/SiMT.
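The two ingredients named in the abstract, a momentum network kept as a temporal ensemble of the meta-learner and a distillation loss against a target model, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names, the decay value, the temperature, and the choice of a temperature-softened KL divergence are all assumptions made for illustration, and the task-specific adaptation of the momentum network is omitted.

```python
import math


def ema_update(momentum_params, learner_params, decay=0.995):
    """Move each momentum parameter a small step toward the meta-learner.

    `decay` is a hypothetical value; parameters are plain lists of floats.
    """
    return [decay * m + (1.0 - decay) * p
            for m, p in zip(momentum_params, learner_params)]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, target_logits, temperature=4.0):
    """KL(target || student) on softened class probabilities.

    The KL form and the temperature are assumptions, not taken from the paper.
    """
    p = softmax(target_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi))
               for pi, qi in zip(p, q) if pi > 0.0)
```

In a training loop built on this sketch, `ema_update` would run after each meta-update of the learner, and `distillation_loss` would be added to the usual task loss. The loss is zero when student and target logits are identical and positive when they disagree.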

Jihoon Tack, Jongjin Park, Hankook Lee, Jaeho Lee, Jinwoo Shin • 2022

Related benchmarks

Task | Dataset | Metric | Result | Rank
Few-shot Image Classification | tieredImageNet | Accuracy | 0.8182 | 90
Few-shot classification | mini-ImageNet → CUB (test) | -- | -- | 75
Few-shot classification | Mini-ImageNet | -- | -- | 41
Few-shot classification | Cars cross-domain from mini-ImageNet | Accuracy | 51.67 | 16
Few-shot classification | CUB cross-domain from tiered-ImageNet | Accuracy | 75.97 | 16
Few-shot classification | Cars cross-domain from tiered-ImageNet | Accuracy | 59.01 | 16
Few-shot regression | ShapeNet 10-shot | Angular Error | 16.121 | 6
Few-shot regression | ShapeNet 15-shot | Angular Error | 14.377 | 6
Few-shot regression | Pascal 10-shot | MSE | 1.462 | 6
Few-shot regression | Pascal 15-shot | MSE | 1.229 | 6