Mutual Information Estimation via $f$-Divergence and Data Derangements
About
Estimating mutual information accurately is pivotal across diverse applications, from machine learning to communications and biology, enabling us to gain insights into the inner mechanisms of complex systems. Yet, dealing with high-dimensional data presents a formidable challenge, due to its size and the presence of intricate relationships. Recently proposed neural methods employing variational lower bounds on the mutual information have gained prominence. However, these approaches suffer from either high bias or high variance, as the sample size and the structure of the loss function directly influence the training process. In this paper, we propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence. We investigate the impact of the permutation function used to obtain the marginal training samples and present a novel architectural solution based on derangements. The proposed estimator is flexible since it exhibits an excellent bias/variance trade-off. The comparison with state-of-the-art neural estimators, through extensive experimentation within established reference scenarios, shows that our approach offers higher accuracy and lower complexity.
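The role of derangements can be illustrated with a short sketch. Discriminative estimators of this kind need samples from the product of marginals $p(x)p(y)$, usually obtained by shuffling the $y$ half of a batch of joint pairs. A plain random permutation may leave some index fixed, in which case a joint pair is wrongly presented as a marginal one; a derangement (a permutation with no fixed points) rules this out. The code below is a minimal illustration and not the authors' implementation: the function names `random_derangement` and `marginal_batch`, the rejection-sampling strategy, and the toy Gaussian data are all assumptions made for the example.

```python
import numpy as np


def random_derangement(n, rng):
    """Sample a permutation of range(n) with no fixed points.

    Hypothetical helper: uses rejection sampling, which accepts
    roughly one draw in e (about 2.7 tries on average) for large n.
    """
    if n < 2:
        raise ValueError("a derangement needs at least 2 elements")
    idx = np.arange(n)
    while True:
        perm = rng.permutation(n)
        if not np.any(perm == idx):  # reject if any sigma(i) == i
            return perm


def marginal_batch(x, y, rng):
    """Pair x[i] with y[sigma(i)], sigma(i) != i, to mimic draws from p(x)p(y)."""
    sigma = random_derangement(len(y), rng)
    return x, y[sigma]


# Toy joint samples (assumed for illustration): y is a noisy copy of x.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 2))
y = x + 0.1 * rng.normal(size=(8, 2))

x_m, y_m = marginal_batch(x, y, rng)
```

With a derangement, no row of `y_m` sits next to the `x` row it was generated from, so every pair in the shuffled batch is a valid stand-in for a marginal sample.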
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Mutual Information Estimation | Gaussian d=20 (test) | Bias | 0.01 | 60 |
| Mutual Information Estimation | Gaussian d=5, N=64 | Variance | 0.04 | 42 |
| Mutual Information Estimation | Gaussian (d=10, N=64) | Bias | 0.07 | 30 |
| Mutual Information Estimation | Gaussian setting d=20, N=64 | Bias | 0.13 | 30 |
| Mutual Information Estimation | High-dimensional image benchmark 32x32 (val) | Est. Mutual Information | 4.919 | 30 |
| Mutual Information Estimation | Low-dimensional benchmark S=10 (test) | Mutual Information (MI) | 9.52 | 30 |
| Mutual Information Estimation | Low-Dimensional Benchmark S=2 to 50, D=10 (test) | MI Estimate | 9.97 | 30 |
| Mutual Information Estimation | high-dimensional image benchmark 16 x 16 | MI Estimate | 6.01 | 24 |
| Mutual Information Estimation | Gaussian d=5, N=64, MI=8 | Variance | 0.06 | 6 |
| Mutual Information Estimation | Gaussian (d=5, N=64, MI=10) | Variance | 0.06 | 6 |