# Nonlinear Information Bottleneck

## About
Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$. IB works by encoding $X$ in a compressed "bottleneck" random variable $M$ from which $Y$ can be accurately decoded. However, finding the optimal bottleneck variable involves a difficult optimization problem, which until recently has been considered for only two limited cases: discrete $X$ and $Y$ with small state spaces, and continuous $X$ and $Y$ with a Gaussian joint distribution (in which case optimal encoding and decoding maps are linear). We propose a method for performing IB on arbitrarily-distributed discrete and/or continuous $X$ and $Y$, while allowing for nonlinear encoding and decoding maps. Our approach relies on a novel non-parametric upper bound for mutual information. We describe how to implement our method using neural networks. We then show that it achieves better performance than the recently-proposed "variational IB" method on several real-world datasets.
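To make the objective concrete: the method trains a stochastic encoder $M = f(X) + \epsilon$ and minimizes a bound on $I(X;M) - \beta\, I(M;Y)$. Below is a minimal sketch of a non-parametric, pairwise-distance upper bound on $I(X;M)$ for a Gaussian-noise encoder, in the spirit of the bound described in the abstract. This is an illustrative reimplementation, not the authors' code; the function name and the choice of isotropic noise `sigma` are assumptions for the example.

```python
import numpy as np

def mi_upper_bound(mu, sigma):
    """Pairwise-distance upper bound on I(X;M) for an encoder
    M = f(X) + eps with eps ~ N(0, sigma^2 I).

    Each row of `mu` is the encoder mean f(x_i) for one training sample.
    The bound treats the marginal of M as a mixture of N Gaussians and
    upper-bounds the mutual information via pairwise distances between
    mixture components.  (Illustrative sketch only, not the authors'
    exact implementation.)
    """
    # squared Euclidean distances between all pairs of encoder means
    d2 = np.sum((mu[:, None, :] - mu[None, :, :]) ** 2, axis=-1)
    logits = -d2 / (2.0 * sigma ** 2)
    # numerically stable log (1/N) sum_j exp(logits_ij);
    # the diagonal (distance 0) guarantees the row max is 0
    row_max = logits.max(axis=1, keepdims=True)
    lse = np.log(np.mean(np.exp(logits - row_max), axis=1)) + row_max[:, 0]
    return -np.mean(lse)
```

The bound behaves as expected at the extremes: if all encoder means coincide, the estimate is 0 (the bottleneck carries no information about $X$), and if the means are far apart relative to `sigma`, it saturates at $\log N$ for a batch of $N$ samples.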
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CINIC-10 (test) | -- | -- | 177 |
| Image Classification | CIFAR-10-C (test) | -- | -- | 61 |
| Image Classification | CIFAR-100 Sym-20% (test) | Accuracy | 55.99 | 33 |
| Image Classification | CIFAR-100 Sym-50% (test) | Accuracy | 46.2 | 32 |
| Image Classification | ANIMAL-10N | Accuracy | 0.7562 | 32 |
| Image Classification | CIFAR-10 40% asymmetric noise | Accuracy | 78.16 | 27 |
| Image Classification | CIFAR-10.1 (test) | Test Error | 14.6 | 13 |
| Image Classification | CIFAR-100-N | -- | -- | 11 |
| Image Classification | CIFAR-10 (50% sym) | Accuracy | 76.16 | 10 |
| Classification | CIFAR-10N | Aggregate Accuracy | 85.21 | 10 |