Meta-Curvature
About
We propose meta-curvature (MC), a framework to learn curvature information for better generalization and fast model adaptation. MC expands on the model-agnostic meta-learner (MAML) by learning to transform the gradients in the inner optimization such that the transformed gradients achieve better generalization performance on a new task. For training large-scale neural networks, we decompose the curvature matrix into smaller matrices in a novel scheme where we capture the dependencies of the model's parameters with a series of tensor products. We demonstrate the effects of our proposed method on several few-shot learning tasks and datasets. Without any task-specific techniques or architectures, the proposed method achieves substantial improvements over previous MAML variants and outperforms recent state-of-the-art methods. Furthermore, we observe faster convergence of the meta-training process. Finally, we present an analysis that explains the better generalization performance obtained with the meta-trained curvature.
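To make the idea of transforming inner-loop gradients with small factor matrices concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: it assumes a single 2-D weight matrix and two hypothetical factor matrices, `m_out` and `m_in`, applied along the output and input dimensions of the gradient. This is equivalent to multiplying the flattened gradient by the Kronecker product of the two factors, without ever building the full matrix. The function names (`transform_gradient`, `inner_step`) are invented for this example, and the meta-training of the factor matrices is not shown.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def transform_gradient(grad, m_out, m_in):
    """Compute m_out @ grad @ m_in^T.

    Assumption: one small matrix per gradient dimension stands in for a
    full curvature matrix over all (output, input) parameter pairs.
    """
    return matmul(matmul(m_out, grad), transpose(m_in))


def inner_step(weights, grad, m_out, m_in, lr):
    """One inner-loop update that uses the transformed gradient."""
    g = transform_gradient(grad, m_out, m_in)
    return [[w - lr * gi for w, gi in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, g)]


# Hypothetical toy values, chosen only for illustration.
weights = [[1.0, 2.0], [3.0, 4.0]]
grad = [[0.5, -1.0], [2.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

# With identity factors the update reduces to a plain gradient step.
plain = inner_step(weights, grad, identity, identity, lr=0.1)

# Non-identity factors rescale and mix gradient entries along each dimension.
m_out = [[2.0, 0.0], [0.0, 1.0]]
m_in = [[1.0, 1.0], [0.0, 1.0]]
mixed = transform_gradient(grad, m_out, m_in)
```

With identity factors, `inner_step` behaves like an ordinary MAML-style gradient step, so in this sketch the factors add capacity on top of that baseline. Storing two small matrices instead of one matrix over all parameter pairs is what keeps the parameter count manageable for larger layers.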
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Few-shot classification | tieredImageNet (test) | Accuracy 82.61 | 282 |
| Few-shot Image Classification | Mini-Imagenet (test) | Accuracy 80.21 | 235 |
| Few-shot Image Classification | miniImageNet (test) | -- | 111 |
| 5-way Few-shot Classification | miniImageNet standard (test) | Accuracy 68.47 | 91 |
| Few-shot Image Classification | FC100 (test) | Accuracy 49.12 | 69 |
| 5-way 5-shot Classification | Omniglot (test) | Accuracy 99.89 | 49 |
| Few-shot classification | Omniglot 20-way 1-shot (test) | Accuracy 99.12 | 43 |
| Few-shot classification | Omniglot 20-way 5-shot (test) | Accuracy 99.65 | 43 |
| 5-way 1-shot Classification | Omniglot (test) | -- | 34 |
| Few-shot Image Classification | CIFAR FS (test) | Worst Accuracy 16.95 | 18 |