Ditto: Building Digital Twins of Articulated Objects from Interaction
About
Digitizing physical objects into the virtual world has the potential to unlock new research and applications in embodied AI and mixed reality. This work focuses on recreating interactive digital twins of real-world articulated objects, which can be directly imported into virtual environments. We introduce Ditto to learn articulation model estimation and 3D geometry reconstruction of an articulated object through interactive perception. Given a pair of visual observations of an articulated object before and after interaction, Ditto reconstructs part-level geometry and estimates the articulation model of the object. We employ implicit neural representations for joint geometry and articulation modeling. Our experiments show that Ditto effectively builds digital twins of articulated objects in a category-agnostic way. We also apply Ditto to real-world objects and deploy the recreated digital twins in physical simulation. Code and additional results are available at https://ut-austin-rpl.github.io/Ditto
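
To make the term "articulation model" concrete, here is a minimal sketch of the two joint types usually considered for articulated objects: a revolute joint (rotation about an axis through a pivot point) and a prismatic joint (translation along an axis). This is an illustration only, not Ditto's implementation. The function names `rotate_about_axis` and `translate_along_axis` and the parameterization by axis, pivot, and joint state are assumptions introduced for this example.

```python
import numpy as np


def rotate_about_axis(points, axis, pivot, angle):
    """Revolute joint: rotate (N, 3) points by `angle` radians about the
    line that passes through `pivot` in direction `axis`.

    Hypothetical helper for illustration; not taken from the Ditto codebase.
    """
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)                      # unit joint axis
    pivot = np.asarray(pivot, dtype=float)
    p = np.asarray(points, dtype=float) - pivot    # express points relative to the pivot
    cos_a, sin_a = np.cos(angle), np.sin(angle)
    # Rodrigues' rotation formula, applied to all points at once
    rotated = p * cos_a + np.cross(k, p) * sin_a + np.outer(p @ k, k) * (1.0 - cos_a)
    return rotated + pivot


def translate_along_axis(points, axis, distance):
    """Prismatic joint: slide (N, 3) points by `distance` along `axis`.

    Hypothetical helper for illustration; not taken from the Ditto codebase.
    """
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)                      # unit joint axis
    return np.asarray(points, dtype=float) + distance * k


# Usage: a "door" point hinged on a vertical axis through (1, 0, 0),
# and a "drawer" point pulled out along the x axis.
door_point = np.array([[2.0, 0.0, 0.0]])
opened = rotate_about_axis(door_point, axis=[0, 0, 1], pivot=[1, 0, 0], angle=np.pi / 2)
drawer_point = np.array([[0.0, 0.0, 0.0]])
pulled = translate_along_axis(drawer_point, axis=[2, 0, 0], distance=0.3)
```

Under this parameterization, moving the mobile part of a reconstructed object to a new joint state amounts to applying one of these transforms to that part's geometry while the static part stays fixed.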
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Articulated Object Reconstruction and Motion Estimation | PARIS Real | Axis Angle Error | 3.8 | 6 |
| Articulation Estimation | Shape2Motion | Prismatic Angle Error | 0.07 | 6 |
| Geometry Reconstruction | Shape2Motion | Whole Chamfer Distance | 0.68 | 6 |
| Articulated Object Reconstruction and Motion Estimation | PARIS Simulation | Axis Angle Error | 54.22 | 6 |
| Geometry Reconstruction | Synthetic dataset | Whole Chamfer Distance | 0.38 | 4 |
| Articulation Estimation | Synthetic dataset | Prismatic Angle Error | 0.06 | 3 |
| Articulated motion synthesis | Synthetic dataset | Chamfer Distance | 0.37 | 2 |
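
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how an axis angle error and a Chamfer distance are commonly computed. The exact conventions behind the reported numbers (units, whether the axis sign is ignored, squared versus unsquared distances, any scaling factor, number of sampled points) are not stated here, so the choices below are assumptions. The function names `axis_angle_error_deg` and `chamfer_distance` are hypothetical.

```python
import numpy as np


def axis_angle_error_deg(pred_axis, gt_axis):
    """Angle in degrees between a predicted and a ground-truth joint axis.

    Assumption: the axis direction is sign-invariant, so an axis and its
    negation count as identical.
    """
    a = np.asarray(pred_axis, dtype=float)
    b = np.asarray(gt_axis, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    cos_sim = np.clip(abs(a @ b), 0.0, 1.0)        # clip guards against rounding error
    return float(np.degrees(np.arccos(cos_sim)))


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3).

    Assumption: unsquared Euclidean nearest-neighbor distances, averaged in
    each direction and summed. Builds a dense N x M distance matrix, so it is
    only suitable for small point sets.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M)
    return float(dists.min(axis=1).mean() + dists.min(axis=0).mean())
```

Lower is better for both quantities. Values computed with a different convention (for example squared distances or a different point count) are not directly comparable with the table.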