Articulated Object Estimation in the Wild

About

Understanding the 3D motion of articulated objects is essential in robotic scene understanding, mobile manipulation, and motion planning. Prior methods for articulation estimation have primarily focused on controlled settings, assuming either fixed camera viewpoints or direct observations of various object states, which tend to fail in more realistic unconstrained environments. In contrast, humans effortlessly infer articulation by watching others manipulate objects. Inspired by this, we introduce ArtiPoint, a novel estimation framework that can infer articulated object models under dynamic camera motion and partial observability. By combining deep point tracking with a factor graph optimization framework, ArtiPoint robustly estimates articulated part trajectories and articulation axes directly from raw RGB-D videos. To foster future research in this domain, we introduce Arti4D, the first ego-centric in-the-wild dataset that captures articulated object interactions at a scene level, accompanied by articulation labels and ground-truth camera poses. We benchmark ArtiPoint against a range of classical and learning-based baselines, demonstrating its superior performance on Arti4D. We make code and Arti4D publicly available at https://artipoint.cs.uni-freiburg.de.

Abdelrhman Werby, Martin Büchner, Adrian Röfer, Chenguang Huang, Wolfram Burgard, Abhinav Valada • 2025

Related benchmarks

Task | Dataset | Result | Rank
Articulated Object Estimation | Arti4D 1.0 (test) | Prismatic Theta Error [deg]: 17.754 | 8
Temporal Interaction Segmentation | Arti4D-Semantic (ego-centric sequences) | 1D IoU: 57.5 | 4
Articulation Estimation | DROID 19 articulated object manipulation demos | Prismatic Joint Angle Error (deg): 35.88 | 2
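
The metrics named in the table are not defined on this page. As a rough illustration only, the sketch below shows one common way such quantities can be computed: a 1D IoU between a predicted and a ground-truth time interval, and an angular error in degrees between a predicted and a ground-truth axis direction. The function names, the (start, end) interval convention, and the choice to treat an axis and its negation as the same direction are assumptions made for this example; the benchmark's actual evaluation protocol may differ.

```python
import math


def interval_iou(pred, gt):
    """1D IoU between two intervals given as (start, end) tuples.

    Hypothetical helper: the interval convention is an assumption.
    """
    # Overlap length is clamped at zero for disjoint intervals.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def axis_angle_error_deg(pred_axis, gt_axis):
    """Angle in degrees between two 3D axis directions.

    Assumption: the axis sign is ignored, so v and -v give zero error.
    """
    dot = sum(p * g for p, g in zip(pred_axis, gt_axis))
    norm = math.sqrt(sum(p * p for p in pred_axis)) * math.sqrt(
        sum(g * g for g in gt_axis)
    )
    if norm == 0.0:
        raise ValueError("axis vectors must be non-zero")
    # Clamp to guard against rounding slightly above 1.0.
    cos_angle = min(1.0, abs(dot) / norm)
    return math.degrees(math.acos(cos_angle))
```

For example, `interval_iou((0, 10), (5, 15))` compares two ten-unit intervals that overlap for five units, and `axis_angle_error_deg((1, 0, 0), (0, 1, 0))` compares two perpendicular axes.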