
LRM: Large Reconstruction Model for Single Image to 3D

About

We propose the first Large Reconstruction Model (LRM) that predicts the 3D model of an object from a single input image within just 5 seconds. In contrast to many previous methods that are trained on small-scale datasets such as ShapeNet in a category-specific fashion, LRM adopts a highly scalable transformer-based architecture with 500 million learnable parameters to directly predict a neural radiance field (NeRF) from the input image. We train our model in an end-to-end manner on massive multi-view data containing around 1 million objects, including both synthetic renderings from Objaverse and real captures from MVImgNet. This combination of a high-capacity model and large-scale training data empowers our model to be highly generalizable and produce high-quality 3D reconstructions from various testing inputs, including real-world in-the-wild captures and images created by generative models. Video demos and interactable 3D meshes can be found on our LRM project webpage: https://yiconghong.me/LRM.
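The key to the 5-second prediction described above is that the transformer directly outputs an explicit, queryable NeRF representation; in the LRM paper this takes the form of a triplane of 2D feature grids. As a minimal sketch of how such a representation is queried (plain NumPy; the plane names, normalization to the unit cube, and summed-feature convention are illustrative assumptions, not the paper's code), a 3D point is projected onto three axis-aligned planes and its feature is assembled from bilinear samples:

```python
import numpy as np

def bilinear_sample(plane, u, v):
    """Bilinearly sample a (H, W, C) feature plane at normalized coords u, v in [0, 1)."""
    H, W, _ = plane.shape
    x, y = u * (W - 1), v * (H - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wx) * (1 - wy) * plane[y0, x0] + wx * (1 - wy) * plane[y0, x1]
            + (1 - wx) * wy * plane[y1, x0] + wx * wy * plane[y1, x1])

def triplane_features(planes, p):
    """Look up a feature vector for 3D point p in [0,1)^3 from three axis-aligned planes.

    planes: dict with keys "xy", "xz", "yz", each a (H, W, C) grid.
    The summation of the three plane samples is one common convention;
    concatenation is another (assumption, not taken from the paper).
    """
    x, y, z = p
    return (bilinear_sample(planes["xy"], x, y)
            + bilinear_sample(planes["xz"], x, z)
            + bilinear_sample(planes["yz"], y, z))
```

In a full NeRF pipeline, the resulting per-point feature would be decoded by a small MLP into density and color, then composited along camera rays by volume rendering.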

Yicong Hong, Kai Zhang, Jiuxiang Gu, Sai Bi, Yang Zhou, Difan Liu, Feng Liu, Kalyan Sunkavalli, Trung Bui, Hao Tan • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
3D Shape Reconstruction | OmniObject3D | CD | 0.407 | 17
Image-conditioned 3D Generation | Objaverse (test) | FID | 38.41 | 10
3D Shape Reconstruction | Pix3D | FS@1 | 0.1458 | 10
Image-to-3D Generation | User Study (test) | Multi-view Consistency | 6.72 | 8
Image-to-3D Mesh Generation | GSO (test) | PSNR | 18.0433 | 8
3D Shape Reconstruction | Ocrtoc3D (test) | FS@1 | 0.1552 | 7
Single Image to 3D Reconstruction | Google Scanned Objects (GSO) orbiting views | Chamfer Distance | 0.1479 | 7
3D Reconstruction | OmniObject3D | PSNR | 18.04 | 7
Single Image to 3D Reconstruction | Google Scanned Objects (GSO) orbiting views | PSNR | 16.728 | 7
Collision-free path planning | plants | Path Length | 3.2 | 6
(10 of 17 benchmark results shown.)
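Two of the rows above report Chamfer Distance (CD), a standard measure of geometric error between a reconstructed and a ground-truth point cloud: each point is matched to its nearest neighbor in the other cloud and the distances are averaged. A minimal sketch, assuming the symmetric two-sided formulation with unsquared distances (conventions differ on squaring and normalization, so absolute values are not directly comparable across papers):

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between point clouds a: (N, 3) and b: (M, 3).

    Brute-force O(N*M) pairwise distances; real evaluation code typically
    uses a KD-tree for nearest-neighbor queries on dense clouds.
    """
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()
```

Identical clouds score 0; lower is better, which is why the CD rows rank with smaller values ahead.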
