
Ambient Diffusion: Learning Clean Distributions from Corrupted Data

About

We present the first diffusion-based framework that can learn an unknown distribution using only highly corrupted samples. This problem arises in scientific applications where uncorrupted samples are impossible or expensive to acquire. Another benefit of our approach is the ability to train generative models that are less likely to memorize individual training samples, since they never observe clean training data. Our main idea is to introduce additional measurement distortion during the diffusion process and require the model to predict the original corrupted image from the further corrupted image. We prove that our method leads to models that learn the conditional expectation of the full uncorrupted image given this additional measurement corruption. This holds for any corruption process that satisfies some technical conditions (in particular, it includes inpainting and compressed sensing). We train models on standard benchmarks (CelebA, CIFAR-10 and AFHQ) and show that we can learn the distribution even when all the training samples have $90\%$ of their pixels missing. We also show that we can fine-tune foundation models on small corrupted datasets (e.g. MRI scans with block corruptions) and learn the clean distribution without memorizing the training set.
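The main idea above can be sketched for the inpainting case. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `further_corrupt` and `ambient_loss`, the drop probability `delta`, the noise level `noise_std`, and the `predict(model_input, further_mask)` callable are all hypothetical stand-ins. The sketch assumes a binary mask marks the surviving pixels, that further corruption drops each surviving pixel independently, and that the loss is measured only on pixels present in the original corrupted sample.

```python
import numpy as np


def further_corrupt(mask, delta, rng):
    """Drop each still-observed pixel independently with probability delta.

    The result is always a subset of the original mask (assumed corruption model).
    """
    keep = rng.random(mask.shape) >= delta
    return mask * keep


def ambient_loss(predict, observed, mask, noise_std, delta, rng):
    """Loss for one training sample, using only the corrupted observation.

    observed : corrupted sample (zero wherever mask == 0)
    mask     : binary mask of pixels that survived the original corruption
    predict  : hypothetical model, called as predict(model_input, further_mask)
    """
    # Additional measurement distortion: hide some of the surviving pixels.
    further_mask = further_corrupt(mask, delta, rng)

    # Diffusion noise on the observed pixels, then apply the further corruption.
    noisy = observed + noise_std * rng.standard_normal(observed.shape)
    model_input = further_mask * noisy

    # The model sees only the further-corrupted input and its mask.
    prediction = predict(model_input, further_mask)

    # Compare against the original corrupted sample on the pixels it contains.
    # Pixels hidden by further_mask but present in mask force the model to
    # fill in content it was not shown.
    residual = mask * (prediction - observed)
    return np.mean(residual ** 2)
```

In this sketch the clean image never appears: the target is the corrupted sample itself, and the supervision signal comes from the pixels that were hidden by the additional corruption but are still present in the original observation.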

Giannis Daras, Kulin Shah, Yuval Dagan, Aravind Gollakota, Alexandros G. Dimakis, Adam Klivans • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Deblurring | CelebA (test) | PSNR 21.16 | 25 |
| Inpainting | AFHQ | LPIPS 0.0861 | 15 |
| Image Denoising | CIFAR-10 (test) | PSNR 21.37 | 13 |
| Denoising | CIFAR-10 32x32 | FID 114.1 | 13 |
| Physical Dynamics Imputation | Global Ocean SSS | MSE 0.286 | 12 |
| Physical Dynamics Imputation | Black Sea CHL | MSE 0.228 | 12 |
| Physical Dynamics Imputation | Baltic Sea NANO | MSE 0.365 | 12 |
| Robotic Manipulation | TwoArm-Lift Gaussian Noise | Success Rate (SR) 90.7 | 8 |
| Robotic Manipulation | Push-T Directional Noise | Success Rate (SR) 73.7 | 8 |
| Robotic Manipulation Simulation | TwoArm-Lift simulation (partial feedback) | Success Rate 99 | 8 |
Showing 10 of 51 rows
