Skullptor: High Fidelity 3D Head Reconstruction in Seconds with Multi-View Normal Prediction

About

Reconstructing high-fidelity 3D head geometry from images is critical for a wide range of applications, yet existing methods face fundamental limitations. Traditional photogrammetry achieves exceptional detail but requires extensive camera arrays (25-200+ views), substantial computation, and manual cleanup in challenging areas like facial hair. Recent alternatives present a fundamental trade-off: foundation models enable efficient single-image reconstruction but lack fine geometric detail, while optimization-based methods achieve higher fidelity but require dense views and expensive computation. We bridge this gap with a hybrid approach that combines the strengths of both paradigms. Our method introduces a multi-view surface normal prediction model that extends monocular foundation models with cross-view attention to produce geometrically consistent normals in a feed-forward pass. We then leverage these predictions as strong geometric priors within an inverse rendering optimization framework to recover high-frequency surface details. Our approach outperforms state-of-the-art single-image and multi-view methods, achieving high-fidelity reconstruction on par with dense-view photogrammetry while reducing camera requirements and computational cost.

Noé Artru, Rukhshanda Hussain, Emeline Got, Alexandre Messier, David B. Lindell, Abdallah Dib • 2026

Related benchmarks

Task	Dataset	Metric	Result
Mesh Reconstruction	NPHM	Depth Error (mm)	2.33	5
Mesh Reconstruction	MultiFace	Depth Error (mm)	2.43	5
Normal estimation	MultiFace	Avg Angular Error	9.13	4
Normal estimation	NPHM	Avg Angular Error	7.29	4
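
For readers unfamiliar with the normal-estimation metric, the sketch below shows one common way an average angular error between predicted and ground-truth normals can be computed. The page states neither the unit nor the evaluation protocol, so the use of degrees, the optional foreground mask, and the function names `angular_error_deg` and `mean_angular_error_deg` are assumptions for illustration, not the authors' evaluation code.

```python
import math


def angular_error_deg(n_pred, n_gt):
    """Angle in degrees between two 3D normals (inputs need not be unit length)."""
    dot = sum(p * g for p, g in zip(n_pred, n_gt))
    norm_p = math.sqrt(sum(p * p for p in n_pred))
    norm_g = math.sqrt(sum(g * g for g in n_gt))
    if norm_p == 0.0 or norm_g == 0.0:
        raise ValueError("normals must be non-zero vectors")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_g)))
    return math.degrees(math.acos(cos_angle))


def mean_angular_error_deg(pred_normals, gt_normals, mask=None):
    """Mean per-pixel angular error over the pixels selected by `mask`.

    `mask` is an assumed per-pixel foreground flag; if omitted, all pixels count.
    """
    if len(pred_normals) != len(gt_normals):
        raise ValueError("prediction and ground truth must have the same length")
    if mask is None:
        mask = [True] * len(pred_normals)
    errors = [
        angular_error_deg(p, g)
        for p, g, keep in zip(pred_normals, gt_normals, mask)
        if keep
    ]
    if not errors:
        raise ValueError("mask selects no pixels")
    return sum(errors) / len(errors)


if __name__ == "__main__":
    pred = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
    gt = [(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]
    print(mean_angular_error_deg(pred, gt))
```

Under this definition a lower value means the predicted normals are better aligned with the reference surface.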

