
"Noisier" Noise Contrastive Estimation is (Almost) Maximum Likelihood

About

Noise Contrastive Estimation (NCE) has fueled major breakthroughs in representation learning and generative modeling. Yet a long-standing challenge remains: accurately estimating ratios between distributions that differ substantially, which significantly limits the applicability of NCE on modern high-dimensional and multimodal datasets. We revisit this problem from a less explored perspective: the magnitude of the noise distribution. Specifically, we show that with a virtually scaled (i.e., artificially increased) noise magnitude, the gradient of the NCE objective can closely align with that of Maximum Likelihood, enabling a trajectory-wise approximation from NCE to MLE, and faster convergence both theoretically and empirically. Building on this insight, we introduce "Noisier" NCE, a simple drop-in modification to vanilla NCE that incurs little to no extra computational cost, while effectively handling density-ratio estimation in challenging regimes where traditional MLE and NCE struggle. Beyond improving classical density-ratio learning, "Noisier" NCE proves broadly applicable: it achieves strong results across image modeling, anomaly detection, and offline black-box optimization. On the CIFAR-10 and ImageNet 64x64 datasets, it yields 10-step and even 1-step samplers that match or surpass state-of-the-art methods, while cutting training iterations by up to half.
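
The gradient-alignment claim can be illustrated with the classical NCE objective, in which a noise-to-data ratio nu weights the noise samples. The sketch below is not the paper's algorithm: it assumes that the "noise magnitude" plays the role of nu, and it uses a hypothetical 1D setup (data N(0, 1), noise N(0, 2^2), model N(mu, 1)) with grid integration in place of sampling. All function and variable names are illustrative. It compares the NCE gradient with respect to mu against the maximum-likelihood gradient for several values of nu.

```python
import numpy as np


def normal_pdf(x, mean, std):
    """Density of a 1D Gaussian evaluated on an array."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def nce_grad_mu(mu, nu, x, dx):
    """Gradient of the classical NCE objective w.r.t. mu (grid integration).

    Assumed toy setup: data N(0, 1), noise N(0, 2^2), model N(mu, 1).
    nu is the noise-to-data ratio, assumed here to stand in for the
    abstract's "noise magnitude".
    """
    p_data = normal_pdf(x, 0.0, 1.0)
    p_noise = normal_pdf(x, 0.0, 2.0)
    p_model = normal_pdf(x, mu, 1.0)
    score = x - mu  # d/dmu log N(x; mu, 1)
    denom = p_model + nu * p_noise
    # E_data[(nu * p_noise / denom) * score]
    data_term = np.sum(p_data * (nu * p_noise / denom) * score) * dx
    # nu * E_noise[(p_model / denom) * score]
    noise_term = np.sum(p_noise * (nu * p_model / denom) * score) * dx
    return data_term - noise_term


def mle_grad_mu(mu, x, dx):
    """Gradient of the expected log-likelihood w.r.t. mu: E_data[x - mu]."""
    p_data = normal_pdf(x, 0.0, 1.0)
    return np.sum(p_data * (x - mu)) * dx


x = np.linspace(-12.0, 12.0, 24001)
dx = x[1] - x[0]
mu = 0.5

mle = mle_grad_mu(mu, x, dx)
# Absolute gap between the NCE gradient and the MLE gradient for each nu.
gaps = {nu: abs(nce_grad_mu(mu, nu, x, dx) - mle) for nu in (1.0, 10.0, 1000.0)}

for nu, gap in gaps.items():
    print(f"nu={nu:g}  |grad_NCE - grad_MLE| = {gap:.6f}")
```

In this toy setting the gap shrinks as nu grows, which is the classical large-nu behaviour of NCE for a normalized model. How the paper realizes the scaling "virtually", without drawing more noise samples, is not specified in the abstract and is not reproduced here.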

Peiyu Yu, Dinghuai Zhang, Hengzhi He, Xiaojian Ma, Sirui Xie, Ruiyao Miao, Yifan Lu, Yasi Zhang, Deqian Kong, Ruiqi Gao, Jianwen Xie, Guang Cheng, Ying Nian Wu • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Unconditional Image Generation | CIFAR-10 | FID 1.45 | 280 |
| Image Generation | CIFAR-10 | FID 77.05 | 212 |
| Image Generation | CelebA | FID 31.09 | 96 |
| Image Generation | CelebA-HQ | FID 95.66 | 92 |
| Conditional Image Generation | ImageNet 64x64 | FID 1.11 | 75 |
| Image Generation | SVHN | FID 25.63 | 37 |
| Generative Modeling | CIFAR-10 | FID 2.99 | 35 |
| Offline Black-box Optimization | Design-bench 100-th percentile | TFBIND8 Score 98.3 | 20 |
| Unsupervised Anomaly Detection | MNIST (Heldout Digit 1) | AUPRC 95.9 | 16 |
| Unsupervised Anomaly Detection | MNIST (Heldout Digit 4) | AUPRC 93.5 | 16 |
Showing 10 of 17 rows
