Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models
About
3D reconstruction of dynamic scenes is a long-standing problem in computer graphics, and it becomes increasingly difficult as less information is available. Shape-from-Template (SfT) methods aim to reconstruct a template-based geometry from RGB images or video sequences, often using just a single monocular camera without depth information, such as regular smartphone recordings. Unfortunately, existing reconstruction methods are either unphysical and noisy or slow to optimize. To solve this problem, we propose a novel SfT reconstruction algorithm for cloth using a pre-trained neural surrogate model that is fast to evaluate, stable, and produces smooth reconstructions due to a regularizing physics simulation. Differentiable rendering of the simulated mesh enables pixel-wise comparisons between the reconstruction and a target video sequence, which can drive a gradient-based optimization procedure that extracts not only shape information but also physical parameters such as the stretching, shearing, or bending stiffness of the cloth. This allows us to retain a precise, stable, and smooth reconstructed geometry while reducing the runtime by a factor of 400-500 compared to $\phi$-SfT, a state-of-the-art physics-based SfT approach.
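The optimization loop described in the abstract (simulate, render, compare pixel-wise, update physical parameters) can be illustrated with a deliberately tiny toy problem. This is only a sketch under strong assumptions, not the authors' implementation. A closed-form sag formula stands in for the neural surrogate, Gaussian splatting of five vertices stands in for a differentiable mesh renderer, central finite differences stand in for automatic differentiation, and a backtracking line search is one arbitrary choice of gradient-based optimizer. All names (`surrogate`, `render`, `fit_stiffness`) and constants are hypothetical.

```python
import numpy as np

SIGMA = 0.05                       # splat radius of the toy renderer (assumed value)
S = np.linspace(0.0, 1.0, 5)       # rest positions of 5 vertices along a pinned strip
PX, PY = np.meshgrid(np.linspace(0.0, 1.0, 32), np.linspace(-0.4, 0.2, 32))

def surrogate(log_k):
    """Toy stand-in for the surrogate model: sag inversely proportional to stiffness k."""
    k = np.exp(log_k)              # optimizing log(k) keeps the stiffness positive
    return S, -S * (1.0 - S) / k

def render(x, y):
    """Soft rasterization: each vertex becomes a Gaussian blob, so pixels vary smoothly."""
    d2 = (PX[..., None] - x) ** 2 + (PY[..., None] - y) ** 2
    return np.exp(-d2 / (2.0 * SIGMA ** 2)).sum(axis=-1)

def loss(log_k, target):
    """Pixel-wise mean squared error between the rendered image and the target image."""
    return np.mean((render(*surrogate(log_k)) - target) ** 2)

def grad(log_k, target, eps=1e-4):
    """Central finite difference; a real pipeline would use automatic differentiation."""
    return (loss(log_k + eps, target) - loss(log_k - eps, target)) / (2.0 * eps)

def fit_stiffness(target, k_init=1.0, iters=50):
    """Gradient descent on log-stiffness with a backtracking (Armijo) line search."""
    theta = np.log(k_init)
    for _ in range(iters):
        g = grad(theta, target)
        current = loss(theta, target)
        step = 100.0
        # Halve the step until the loss decreases sufficiently (or the step is negligible).
        while step > 1e-12 and loss(theta - step * g, target) > current - 0.5 * step * g * g:
            step *= 0.5
        theta -= step * g
    return float(np.exp(theta))

k_true = 2.0
target = render(*surrogate(np.log(k_true)))   # synthetic "video frame" with known stiffness
k_est = fit_stiffness(target)
print(f"true stiffness {k_true:.3f}, estimated {k_est:.3f}")
```

The soft renderer matters here: a hard rasterizer would give a piecewise-constant image and no useful gradient with respect to the stiffness, which is why the comparison is made on a smoothly varying image.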
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Cloth Simulation | Standard motion sequences (test) | Chamfer Distance (eCD) | 0.0107 | 119 |
| Cloth Simulation | Motion Sequences (rot_h1) | e3D | 23.8 | 8 |
| Cloth Simulation | Motion Sequences (Average) | e3D Score | 22.4 | 7 |
| Cloth Simulation | Motion Sequences xy v2 | e3D | 24.1 | 7 |
| 3D Mesh Reconstruction | ϕ-SfT synthetic (S1) | Average Mesh Error | 0.0298 | 4 |
| Shape-from-Template | phi-SfT synthetic dataset (S1) | e3D Error | 0.0298 | 4 |
| 3D Mesh Reconstruction | ϕ-SfT synthetic (S2) | Average Mesh Error | 0.042 | 4 |
| Shape-from-Template | phi-SfT synthetic dataset (S2) | e3D | 0.0448 | 4 |
| 3D Mesh Reconstruction | ϕ-SfT S3 synthetic | Average Mesh Error | 0.0823 | 4 |
| 3D Mesh Reconstruction | ϕ-SfT synthetic (S4) | Average Mesh Error | 0.0919 | 4 |
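
For readers unfamiliar with the Chamfer distance reported above, the sketch below computes one common variant between two point sets: the symmetric sum of mean nearest-neighbor Euclidean distances. The page does not state which variant the benchmark uses (squared or unsquared distances, sum or average of the two directions), so the function `chamfer_distance` is an illustrative assumption and its values are not directly comparable to the table.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed variant: unsquared Euclidean distances, mean over each set,
    summed over both directions.
    """
    # Pairwise distance matrix of shape (N, M).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    # Mean distance from each point to its nearest neighbor in the other set.
    return float(d.min(axis=1).mean() + d.min(axis=0).mean())

# Two reconstructions of the same three vertices, one shifted by 0.1 along z.
reference = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
shifted = reference + np.array([0.0, 0.0, 0.1])
print(chamfer_distance(reference, shifted))
```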