Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation
About
A diffusion model learns to predict a vector field of gradients. We propose to apply the chain rule to the learned gradients and back-propagate the score of a diffusion model through the Jacobian of a differentiable renderer, which we instantiate as a voxel radiance field. This setup aggregates 2D scores from multiple camera viewpoints into a 3D score and repurposes a pretrained 2D model for 3D data generation. We identify a technical challenge of distribution mismatch that arises in this application and propose a novel estimation mechanism to resolve it. We run our algorithm on several off-the-shelf diffusion image generative models, including the recently released Stable Diffusion trained on the large-scale LAION dataset.
Haochen Wang, Xiaodan Du, Jiahao Li, Raymond A. Yeh, Greg Shakhnarovich • 2022
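The chain-rule idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear "renderers" `views`, the Gaussian stand-in `score_2d`, and the helper names `score_3d`, `total_log_p`, and `ascend` are all hypothetical and chosen only so the example runs with NumPy alone. It ignores noise levels and the distribution-mismatch issue the abstract mentions, and shows only how per-view 2D scores are pulled back through each renderer's Jacobian and summed into a 3D score.

```python
import numpy as np

rng = np.random.default_rng(0)
D_3D, D_2D, N_VIEWS = 6, 4, 3

# Hypothetical linear "renderers": view i maps 3D parameters theta to an
# image x_i = A_i @ theta, so the Jacobian d x_i / d theta is just A_i.
views = [rng.normal(size=(D_2D, D_3D)) for _ in range(N_VIEWS)]

# Stand-in for a pretrained 2D score model (an assumption for illustration):
# the exact score of an isotropic Gaussian image prior N(mu, sigma^2 I).
mu = rng.normal(size=D_2D)
sigma = 0.5


def score_2d(x):
    """grad_x log p(x) for the toy Gaussian image prior."""
    return -(x - mu) / sigma**2


def log_p_2d(x):
    """log p(x) up to an additive constant, used only for checking."""
    return -0.5 * np.sum((x - mu) ** 2) / sigma**2


def score_3d(theta):
    """Chain rule: sum over views of J_i^T @ score_2d(render_i(theta))."""
    return sum(A.T @ score_2d(A @ theta) for A in views)


def total_log_p(theta):
    """Sum of per-view log densities; its gradient is score_3d."""
    return sum(log_p_2d(A @ theta) for A in views)


def ascend(theta, steps=200, lr=1e-3):
    """Plain gradient ascent on theta using the aggregated 3D score."""
    for _ in range(steps):
        theta = theta + lr * score_3d(theta)
    return theta
```

Calling `ascend(np.ones(D_3D))` moves the 3D parameters toward values whose renderings are more likely under the toy 2D prior. In a realistic setting the matrix products with `A.T` would be replaced by automatic differentiation through the renderer.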
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Text-to-3D Generation | GPTEval3D 110 prompts 1.0 | GPTEval3D Alignment | 1.13e+3 | 20 |
| Text-to-3D Generation | T³Bench Single Object with Surroundings | BRISQUE | 82 | 14 |
| Text-to-3D Generation | T³Bench Single Object | Alignment Score | 23 | 11 |
| Text-to-3D Generation | T³Bench Multiple Objects | Quality Score | 17.7 | 7 |
| Text-to-3D Generation | T3Bench (test) | Single Object Score | 24.7 | 7 |
| Text-to-3D Generation | 43 prompts and 50 views (evaluation set) | CLIP Score | 30.39 | 6 |
| Text-to-3D Generation | 70 user prompts | View Consistency | 9.58 | 2 |
| Text-to-3D Generation | 70 prompts featuring countable faces | Success Rate | 29.3 | 2 |