MotionAnymesh: Physics-Grounded Articulation for Simulation-Ready Digital Twins
About
Converting static 3D meshes into interactable articulated assets is crucial for embodied AI and robotic simulation, yet existing zero-shot pipelines struggle with complex assets because they lack physical grounding. Specifically, ungrounded Vision-Language Models (VLMs) frequently hallucinate kinematics, and unconstrained joint estimation leads to severe mesh inter-penetration during physical simulation. To bridge this gap, we propose MotionAnymesh, an automated zero-shot framework that transforms unstructured static meshes into simulation-ready digital twins. Our method features a kinematic-aware part segmentation module that grounds VLM reasoning with explicit SP4D physical priors, suppressing kinematic hallucinations. We further introduce a geometry-physics joint estimation pipeline that combines robust type-aware initialization with physics-constrained trajectory optimization to guarantee collision-free articulation. Extensive experiments show that MotionAnymesh significantly outperforms state-of-the-art baselines in both geometric precision and dynamic physical executability, providing reliable assets for downstream applications.
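To make the physics-constrained trajectory optimization concrete, the sketch below shows the core idea in miniature: sweep a candidate joint (here a revolute axis) through its motion range and penalize configurations where the moving part penetrates the static part. This is an illustrative NumPy-only toy, not the paper's implementation — the function names (`rotate`, `penetration_penalty`, `trajectory_penalty`), the point-cloud proxy for the mesh, and the distance-threshold collision test are all simplifying assumptions.

```python
import numpy as np

def rotate(points, axis, pivot, angle):
    # Rodrigues rotation of points about a revolute joint axis through `pivot`.
    axis = axis / np.linalg.norm(axis)
    p = points - pivot
    c, s = np.cos(angle), np.sin(angle)
    cross = np.cross(np.broadcast_to(axis, p.shape), p)
    dot = p @ axis
    return p * c + cross * s + np.outer(dot, axis) * (1 - c) + pivot

def penetration_penalty(moving, static, radius=0.05):
    # Soft collision cost: penalize each moving point whose distance to the
    # nearest static point falls below `radius` (a crude inter-penetration proxy).
    d = np.linalg.norm(moving[:, None, :] - static[None, :, :], axis=-1)
    return float(np.sum(np.maximum(radius - d.min(axis=1), 0.0)))

def trajectory_penalty(moving, static, axis, pivot, max_angle, steps=10):
    # Accumulate penetration over sampled joint angles along the full trajectory,
    # so a joint hypothesis is scored by its entire motion, not just the rest pose.
    return sum(
        penetration_penalty(rotate(moving, axis, pivot, a), static)
        for a in np.linspace(0.0, max_angle, steps)
    )
```

A joint candidate whose swept motion stays clear of the static geometry scores zero, while one whose trajectory drives the part into an obstacle accrues a positive cost that an optimizer can minimize over axis and pivot parameters.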
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Joint parameter estimation | Diverse 3D Assets (test) | Type Error | 0.08 | 6 |
| Part Segmentation | Diverse 3D Assets (test) | mIoU | 0.86 | 6 |
| Physical Executability | Diverse 3D Assets (test) | Executability | 87 | 6 |