CAT3D: Create Anything in 3D with Multi-View Diffusion Models

About

Advances in 3D reconstruction have enabled high-quality 3D capture, but require a user to collect hundreds to thousands of images to create a 3D scene. We present CAT3D, a method for creating anything in 3D by simulating this real-world capture process with a multi-view diffusion model. Given any number of input images and a set of target novel viewpoints, our model generates highly consistent novel views of a scene. These generated views can be used as input to robust 3D reconstruction techniques to produce 3D representations that can be rendered from any viewpoint in real time. CAT3D can create entire 3D scenes in as little as one minute, and outperforms existing methods for single-image and few-view 3D scene creation. See our project page for results and interactive demos at https://cat3d.github.io.

Ruiqi Gao, Aleksander Holynski, Philipp Henzler, Arthur Brussee, Ricardo Martin-Brualla, Pratul Srinivasan, Jonathan T. Barron, Ben Poole • 2024

Related benchmarks

Task                        Dataset                     Result       Rank
Novel View Synthesis        LLFF                        PSNR 25.63   124
Novel View Synthesis        RealEstate10K               PSNR 32.2    116
Novel View Synthesis        Mip-NeRF360                 PSNR 18.67   104
Novel View Synthesis        DTU                         PSNR 25.92   100
Novel View Synthesis        Tanks&Temples               PSNR 12.525  52
Novel View Synthesis        CO3D                        PSNR 23.58   24
Few-view 3D Reconstruction  RealEstate10K (test)        PSNR 32.2    20
Few-view 3D Reconstruction  LLFF (out-of-distribution)  PSNR 25.63   12
Few-view 3D Reconstruction  DTU (out-of-distribution)   PSNR 25.92   12
Few-view 3D Reconstruction  Co3D (test)                 PSNR 23.58   12
Showing 10 of 22 rows
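
For readers unfamiliar with the metric in the Result column, PSNR (peak signal-to-noise ratio, in dB) is conventionally defined as 10 * log10(MAX^2 / MSE) between a rendered image and its ground-truth reference. The following is a minimal sketch of that standard definition only. This page does not state how the listed numbers were computed, so the pixel range (assumed here to be [0, 1]), the flat-list image representation, and the function name `psnr` are illustrative assumptions, not the benchmark's evaluation code.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Return PSNR in dB between two equally sized flat sequences of pixel values.

    Assumption: pixel values lie in [0, max_value]; the benchmark's actual
    range, color space, and per-view averaging are not given on this page.
    """
    if len(reference) != len(rendered):
        raise ValueError("images must contain the same number of pixel values")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixel values.
    mse = sum((r - p) ** 2 for r, p in zip(reference, rendered)) / len(reference)

    # Identical images have zero error, so PSNR is unbounded.
    if mse == 0:
        return math.inf

    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel grayscale images that differ by 0.1 everywhere.
ground_truth = [0.5, 0.5, 0.5, 0.5]
prediction = [0.6, 0.6, 0.6, 0.6]
score = psnr(ground_truth, prediction)
```

Higher values indicate a rendering closer to the reference. Because the scale is logarithmic, a gap of a few dB between methods corresponds to a substantial difference in mean squared error.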
