Amortized Vine Copulas for High-Dimensional Density and Information Estimation
About
Modeling high-dimensional dependencies while keeping likelihoods tractable remains challenging. Classical vine-copula pipelines are interpretable but can be expensive, while many neural estimators are flexible but less structured. In this work, we propose Vine Denoising Copula (VDC), an amortized vine-copula pipeline for continuous-data, simplified-vine dependence modeling. VDC trains a single bivariate denoising model and reuses it across all vine edges. For each edge, given pseudo-observations, the model predicts a piecewise-constant density grid. We then apply an IPFP/Sinkhorn projection that normalizes mass and drives the marginals to uniformity. This preserves the tractable vine-likelihood structure and the usual copula interpretation while replacing repeated per-edge optimization with GPU inference. Across synthetic and real-data benchmarks, VDC delivers strong bivariate density accuracy, competitive MI/TC estimation, and faster high-dimensional vine fitting. These gains make explicit information estimation and dependence decomposition feasible when repeated vine fitting would otherwise be costly, while conditional downstream tasks remain a limitation.
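The projection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ipfp_project` and `mutual_information_nats`, the iteration cap, the tolerance, and the clipping floor are all assumptions chosen for illustration. The sketch takes a nonnegative square grid of piecewise-constant density values on the unit square, alternately rescales rows and columns until both marginals are uniform, and then reads off mutual information as the grid average of c log c.

```python
import numpy as np


def ipfp_project(grid, n_iters=200, tol=1e-10, floor=1e-12):
    """Rescale a nonnegative m x m grid so both marginals are uniform.

    Each cell holds a density value on an equal-width partition of [0, 1]^2,
    so integrating over one axis is the same as taking the mean over it.
    All parameter defaults are illustrative assumptions.
    """
    density = np.clip(np.asarray(grid, dtype=float), floor, None)
    if density.ndim != 2 or density.shape[0] != density.shape[1]:
        raise ValueError("expected a square 2-D grid")
    for _ in range(n_iters):
        # Force the marginal over the first coordinate to equal 1 everywhere.
        density = density / density.mean(axis=1, keepdims=True)
        # Force the marginal over the second coordinate to equal 1 everywhere.
        density = density / density.mean(axis=0, keepdims=True)
        # Column marginals are now exact; stop once the rows agree as well.
        if np.abs(density.mean(axis=1) - 1.0).max() < tol:
            break
    return density


def mutual_information_nats(density):
    """Mutual information of a gridded copula density: the mean of c * log(c)."""
    return float(np.mean(density * np.log(density)))


# Hypothetical raw model output: mass concentrated near the diagonal.
m = 64
centers = (np.arange(m) + 0.5) / m
raw = 0.05 + np.exp(-((centers[:, None] - centers[None, :]) ** 2) / 0.02)

copula = ipfp_project(raw)
mi = mutual_information_nats(copula)
```

Because uniform marginals imply that the grid mean is 1, the projected grid is a valid copula density without a separate normalization step, and the same grid can be reused for likelihood evaluation and for the information estimate.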
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Density Estimation | HEPMASS d=21; N=525,123 (test) | -- | -- | 8 |
| Bivariate Mutual Information Estimation | Synthetic copulas with analytic MI | MAE (nats) | 0.011 | 7 |
| Bivariate copula estimation | Synthetic copulas (held-out suite) | ISE | 5.13e-7 | 5 |
| Bivariate density estimation and conditional-transform accuracy | Complex-copula synthetic bivariate families (ring, double-banana, etc.) | ISE | 9.93e-7 | 5 |
| Density Estimation | Gas d=8 (test) | NLL | -0.002 | 5 |
| Density Estimation | Miniboone d=50 (test) | NLL | -0.053 | 5 |
| Density Estimation | Power d=5 (test) | NLL | -0.679 | 5 |
| Density Estimation | Credit (d=24) (test) | NLL | 0.001 | 5 |
| Joint density and information metric estimation | Controlled synthetic settings (stress-test) | NLL (bits/dim) | -0.354 | 4 |
| Missing Data Imputation | Power 20% MCAR | RMSE | 6.893 | 2 |