Efficient Maximal Coding Rate Reduction by Variational Forms
About
The principle of Maximal Coding Rate Reduction (MCR$^2$) has recently been proposed as a training objective for learning discriminative low-dimensional structures intrinsic to high-dimensional data, allowing for more robust training than standard approaches such as cross-entropy minimization. Despite these advantages, MCR$^2$ incurs a substantial computational cost: the objective requires evaluating and differentiating a number of log-determinant terms that grows linearly with the number of classes. By taking advantage of variational forms of spectral functions of a matrix, we reformulate the MCR$^2$ objective into a form that scales to large numbers of classes without compromising training accuracy. Image classification experiments demonstrate that our proposed formulation yields a significant speedup over optimizing the original MCR$^2$ objective directly, and often produces higher-quality learned representations. Further, our approach may be of independent interest in other models that require computing log-determinant forms, such as system identification or normalizing flow models.
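The key idea can be illustrated with a standard variational identity for the log-determinant (the paper's exact reformulation may differ in detail): for any positive-definite $C \in \mathbb{R}^{d \times d}$, $\log\det(C) = \min_{\Lambda \succ 0} \; \operatorname{tr}(\Lambda C) - \log\det(\Lambda) - d$, with the minimum attained at $\Lambda = C^{-1}$. Replacing each log-determinant term by such a variational form turns an expensive spectral computation into a jointly optimized trace term. A minimal NumPy sketch verifying this identity (all variable names here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
A = rng.standard_normal((d, d))
C = A @ A.T + np.eye(d)          # a random positive-definite matrix

# True log-determinant, computed stably via slogdet.
logdet_C = np.linalg.slogdet(C)[1]

def variational_bound(Lam):
    """Upper bound tr(Lam C) - log det(Lam) - d, tight at Lam = C^{-1}."""
    return np.trace(Lam @ C) - np.linalg.slogdet(Lam)[1] - d

# At the minimizer Lam = C^{-1}, the bound equals log det(C).
assert np.isclose(variational_bound(np.linalg.inv(C)), logdet_C)

# For any other positive-definite Lam, the bound overestimates log det(C).
B = rng.standard_normal((d, d))
Lam = B @ B.T + np.eye(d)
assert variational_bound(Lam) >= logdet_C
```

Because the bound is convex in $\Lambda$ and tight at $\Lambda = C^{-1}$, the auxiliary variable can be optimized alongside the network parameters instead of forming each log-determinant explicitly per class.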
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | MNIST (test) | Accuracy | 97.88 | 882 |
| Classification | CIFAR-100 (test) | Accuracy | 58.72 | 129 |
| Classification | Tiny ImageNet 200 (test) | Test Accuracy | 26.65 | 16 |
| Classification | CIFAR-100 (train) | Training Delta R | 218 | 2 |
| Classification | MNIST (train) | Training Delta R | 44.2117 | 2 |
| Classification | CIFAR-10 (train) | Training Delta R | 48.43 | 2 |
| Classification | Tiny ImageNet 200 (train) | Training Delta R | 231.2 | 2 |