Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation

About

A diffusion model learns to predict a vector field of gradients. We propose to apply the chain rule to the learned gradients and back-propagate the score of a diffusion model through the Jacobian of a differentiable renderer, which we instantiate as a voxel radiance field. This setup aggregates 2D scores from multiple camera viewpoints into a 3D score, and repurposes a pretrained 2D model for 3D data generation. We identify a technical challenge of distribution mismatch that arises in this application, and propose a novel estimation mechanism to resolve it. We run our algorithm on several off-the-shelf diffusion image generative models, including the recently released Stable Diffusion trained on the large-scale LAION dataset.
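The chaining idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation. The names `toy_score_2d`, `render`, and `chained_score_3d` are hypothetical, the "renderer" is assumed to be a linear map per view (so its Jacobian is just a matrix), and the 2D score is assumed to be that of a Gaussian rather than a pretrained diffusion model. The distribution-mismatch correction mentioned above is not modeled here.

```python
import numpy as np

def toy_score_2d(x, mu, sigma=1.0):
    """Score (gradient of the log-density) of an isotropic Gaussian N(mu, sigma^2 I).

    Assumption: this stands in for the score learned by a pretrained 2D model.
    """
    return -(x - mu) / sigma**2

def render(theta, A):
    """Hypothetical linear 'renderer': image = A @ theta, so its Jacobian is A."""
    return A @ theta

def chained_score_3d(theta, views, mu, sigma=1.0):
    """Aggregate 2D scores into a 3D score via the chain rule.

    For each view i, compute J_i^T @ score_2d(render_i(theta)) and sum over views.
    """
    total = np.zeros_like(theta)
    for A in views:
        x = render(theta, A)                       # 2D "image" for this view
        total += A.T @ toy_score_2d(x, mu, sigma)  # vector-Jacobian product
    return total

# Toy setup: 6 "3D" parameters, 3 camera views, 4 "pixels" per view.
rng = np.random.default_rng(0)
theta = rng.normal(size=6)
views = [rng.normal(size=(4, 6)) for _ in range(3)]
mu = np.ones(4)

score_3d = chained_score_3d(theta, views, mu)
```

In this toy setting the aggregated vector equals the gradient, with respect to `theta`, of the summed per-view log-densities, so it could drive a simple gradient-ascent update on the 3D parameters. With a real differentiable renderer, the vector-Jacobian product would typically come from automatic differentiation rather than an explicit matrix.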

Haochen Wang, Xiaodan Du, Jiahao Li, Raymond A. Yeh, Greg Shakhnarovich • 2022

Related benchmarks

Task	Dataset	Result
Text-to-3D Generation	GPTEval3D 110 prompts 1.0	GPTEval3D Alignment: 1.13e+3	20
Text-to-3D Generation	T³Bench Multiple Objects	Quality Score: 17.7	16
Text-to-3D Generation	MATE-3D	HyperScore Alignment: 4.02	15
Text-to-3D Generation	T³Bench Single Object with Surroundings	BRISQUE: 82	14
Text-to-3D Generation	T3Bench (test)	Single Object Score: 24.7	14
Text-to-3D Generation	T³Bench Single Object	Alignment Score: 23	11
Text-to-3D Generation	T3Bench frozen (300-prompt audit set)	CLIP Score: 20.78	10
Measured orbit coverage	T3Bench 300-prompt derived (frozen audit set)	Metric A (Coverage Count/Score): 357	9
Text-to-3D Generation	ComboVerse (test)	CLIP Score: 0.303	8
Text-to-3D Generation	Text-to-3D 43-prompt set	CLIP Score: 30.39	8

Showing 10 of 13 rows
