
$PC^2$: Projection-Conditioned Point Cloud Diffusion for Single-Image 3D Reconstruction

About

Reconstructing the 3D shape of an object from a single RGB image is a long-standing and highly challenging problem in computer vision. In this paper, we propose a novel method for single-image 3D reconstruction which generates a sparse point cloud via a conditional denoising diffusion process. Our method takes as input a single RGB image along with its camera pose and gradually denoises a set of 3D points, whose positions are initially sampled randomly from a three-dimensional Gaussian distribution, into the shape of an object. The key to our method is a geometrically-consistent conditioning process which we call projection conditioning: at each step in the diffusion process, we project local image features onto the partially-denoised point cloud from the given camera pose. This projection conditioning process enables us to generate high-resolution sparse geometries that are well-aligned with the input image, and can additionally be used to predict point colors after shape reconstruction. Moreover, due to the probabilistic nature of the diffusion process, our method is naturally capable of generating multiple different shapes consistent with a single input image. In contrast to prior work, our approach not only performs well on synthetic benchmarks, but also gives large qualitative improvements on complex real-world data.
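The projection conditioning step described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `project_features`, the pinhole camera model with intrinsics `K` and world-to-camera pose `(R, t)`, and the nearest-pixel feature lookup are all assumptions made for illustration. The sketch also ignores occlusion, so every point that lands inside the image receives a feature.

```python
import numpy as np

def project_features(points, feat_map, K, R, t):
    """Attach image features to 3D points via pinhole projection (hypothetical helper).

    points:   (N, 3) world-space points, e.g. a partially denoised cloud
    feat_map: (C, H, W) per-pixel image features
    K:        (3, 3) camera intrinsics (assumed pinhole model)
    R, t:     (3, 3) rotation and (3,) translation, world -> camera
    Returns an (N, 3 + C) array: each point concatenated with its feature.
    """
    C, H, W = feat_map.shape
    cam = points @ R.T + t                      # world -> camera coordinates
    z = cam[:, 2]
    uvw = cam @ K.T                             # homogeneous pixel coordinates
    safe_z = np.where(np.abs(z) > 1e-8, z, 1e-8)
    col = np.round(uvw[:, 0] / safe_z).astype(int)
    row = np.round(uvw[:, 1] / safe_z).astype(int)

    # Keep only points in front of the camera that land inside the image.
    visible = (z > 0) & (col >= 0) & (col < W) & (row >= 0) & (row < H)

    # Points outside the view keep an all-zero feature vector.
    feats = np.zeros((points.shape[0], C), dtype=feat_map.dtype)
    feats[visible] = feat_map[:, row[visible], col[visible]].T
    return np.concatenate([points, feats], axis=1)
```

In a diffusion sampler, a call like this would run once per denoising step, so the features attached to each point follow the point as it moves. The denoising network would then take the `(N, 3 + C)` array as input.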

Luke Melas-Kyriazi, Christian Rupprecht, Andrea Vedaldi • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| 3D Reconstruction | ShapeNet (test) | -- | 74 |
| Single-view 3D Reconstruction | ShapeNet-R2N2 (test) | mIoU 34.3 | 22 |
| 2D-to-3D Reconstruction | ShapeNet 1 (test) | Chamfer Distance 5.39 | 18 |
| 3D Shape Reconstruction | Pix3D chair | CD 115.9 | 14 |
| 3D Object Reconstruction | CO3D 10 held-out categories v2 | Accuracy 34.2 | 6 |
| 3D Human-Object Shape Reconstruction | InterCap unseen objects | Combined F-score@0.01m 38.43 | 6 |
| 3D Object Reconstruction | Pix3D Sofa | CD 47.17 | 6 |
| 3D Object Reconstruction | Pix3D Table | CD 202.8 | 6 |
| 3D Reconstruction | BEHAVE (test) | Combined F-score@0.01m 42.31 | 4 |
| 3D Reconstruction | InterCap | Combined F-score@0.01m 50.57 | 4 |
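Several of the benchmarks above report Chamfer Distance (CD) or an F-score at a 0.01 m threshold. The sketch below shows one common way to compute these two metrics between a predicted and a ground-truth point cloud. The conventions are assumptions: each benchmark may use a different scaling, may or may not square the distances, and may sample points differently, so the numbers listed here cannot be reproduced from this code. The "Combined" variant of the F-score is not defined on this page and is not modeled.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed convention: mean squared nearest-neighbor distance,
    summed over both directions (a -> b and b -> a).
    """
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (N, M)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def f_score(pred, gt, tau=0.01):
    """F-score at distance threshold tau (same units as the points).

    Precision: fraction of predicted points within tau of some GT point.
    Recall:    fraction of GT points within tau of some predicted point.
    """
    d = np.sqrt(np.sum((pred[:, None, :] - gt[None, :, :]) ** 2, axis=-1))
    precision = np.mean(d.min(axis=1) < tau)
    recall = np.mean(d.min(axis=0) < tau)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Both functions build the full pairwise distance matrix, which is fine for a few thousand points but would need a nearest-neighbor structure for larger clouds.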
