"Noisier" Noise Contrastive Estimation is (Almost) Maximum Likelihood
About
Noise Contrastive Estimation (NCE) has fueled major breakthroughs in representation learning and generative modeling. Yet a long-standing challenge remains: accurately estimating ratios between distributions that differ substantially, which significantly limits the applicability of NCE on modern high-dimensional and multimodal datasets. We revisit this problem from a less explored perspective: the magnitude of the noise distribution. Specifically, we show that with a virtually scaled (i.e., artificially increased) noise magnitude, the gradient of the NCE objective can closely align with that of Maximum Likelihood, enabling a trajectory-wise approximation from NCE to MLE, and faster convergence both theoretically and empirically. Building on this insight, we introduce "Noisier" NCE, a simple drop-in modification to vanilla NCE that incurs little to no extra computational cost, while effectively handling density-ratio estimation in challenging regimes where traditional MLE and NCE struggle. Beyond improving classical density-ratio learning, "Noisier" NCE proves broadly applicable: it achieves strong results across image modeling, anomaly detection, and offline black-box optimization. On the CIFAR-10 and ImageNet 64x64 datasets, it yields 10-step and even 1-step samplers that match or surpass state-of-the-art methods, while cutting training iterations by up to half.
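The gradient-alignment claim can be illustrated on a toy problem. The sketch below is not the paper's algorithm, which the abstract does not spell out; it assumes the classical NCE objective with a noise weight `nu` as a stand-in for the "scaled noise magnitude", a 1-D Gaussian data and noise distribution, and an unnormalized Gaussian model with parameters `mu` and `c`. All names and values are hypothetical. Expectations are computed by quadrature on a grid, so the result is deterministic.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

# Integration grid; all densities below are negligible at the boundaries.
x = np.linspace(-12.0, 12.0, 20001)
dx = x[1] - x[0]

def integrate(f):
    """Riemann sum over the grid (last axis)."""
    return np.sum(f, axis=-1) * dx

# Toy setup (assumed for illustration only).
p_data = gaussian_pdf(x, 1.0, 1.0)    # data distribution
p_noise = gaussian_pdf(x, 0.0, 2.0)   # noise distribution

# Unnormalized model: log p_model(x) = log N(x; mu, 1) + c, with c a free
# log-scale parameter (c = 0 would make the model normalized).
mu, c = -0.5, 0.2
p_model = np.exp(c) * gaussian_pdf(x, mu, 1.0)

# Gradient of log p_model with respect to (mu, c), one row per parameter.
score = np.stack([x - mu, np.ones_like(x)])

def nce_gradient(nu):
    """Ascent gradient of the classical NCE objective
    J = E_data[log h] + nu * E_noise[log(1 - h)],
    where h = p_model / (p_model + nu * p_noise)."""
    h = p_model / (p_model + nu * p_noise)
    data_term = integrate(p_data * (1.0 - h) * score)
    noise_term = nu * integrate(p_noise * h * score)
    return data_term - noise_term

# Maximum-likelihood ascent gradient for an unnormalized model:
# E_data[score] - integral of p_model * score.
mle_gradient = integrate((p_data - p_model) * score)

noise_weights = [1.0, 10.0, 100.0, 1e4]
errors, cosines = [], []
for nu in noise_weights:
    g = nce_gradient(nu)
    errors.append(np.linalg.norm(g - mle_gradient))
    cosines.append(g @ mle_gradient / (np.linalg.norm(g) * np.linalg.norm(mle_gradient)))
    print(f"nu={nu:g}  |grad_NCE - grad_MLE|={errors[-1]:.3e}  cosine={cosines[-1]:.6f}")
```

In this setup the two gradients differ by the integral of `h * (p_data - p_model) * score`, and `h` shrinks toward zero as `nu` grows, which is why the gap closes. Whether the paper scales the noise in exactly this way is an assumption of the sketch.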
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Unconditional Image Generation | CIFAR-10 | FID | 1.45 | 280 |
| Image Generation | CIFAR-10 | FID | 77.05 | 212 |
| Image Generation | CelebA | FID | 31.09 | 96 |
| Image Generation | CelebA-HQ | FID | 95.66 | 92 |
| Conditional Image Generation | ImageNet 64x64 | FID | 1.11 | 75 |
| Image Generation | SVHN | FID | 25.63 | 37 |
| Generative Modeling | CIFAR-10 | FID | 2.99 | 35 |
| Offline Black-box Optimization | Design-bench 100-th percentile | TFBIND8 Score | 98.3 | 20 |
| Unsupervised Anomaly Detection | MNIST (Heldout Digit 1) | AUPRC | 95.9 | 16 |
| Unsupervised Anomaly Detection | MNIST (Heldout Digit 4) | AUPRC | 93.5 | 16 |