Differentiable Robot Rendering

About

Vision foundation models trained on massive amounts of visual data have shown unprecedented reasoning and planning skills in open-world settings. A key challenge in applying them to robotic tasks is the modality gap between visual data and action data. We introduce differentiable robot rendering, a method that makes the visual appearance of a robot body directly differentiable with respect to its control parameters. Our model integrates a kinematics-aware deformable model with Gaussian Splatting and is compatible with any robot form factor and any number of degrees of freedom. We demonstrate its capability and usage in applications including reconstructing robot poses from images and controlling robots through vision language models. Quantitative and qualitative results show that our differentiable rendering model provides effective gradients for robotic control directly from pixels, laying the foundation for future applications of vision foundation models in robotics.
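To make the idea of "gradients for robotic control directly from pixels" concrete, here is a minimal toy sketch. It is not the authors' model: it assumes a planar two-link arm with closed-form forward kinematics in place of the kinematics-aware deformable model, and two isotropic 2D Gaussian blobs (one per keypoint) in place of Gaussian Splatting. All names, link lengths, and step sizes are hypothetical. The sketch renders an image from joint angles, derives the gradient of a pixel loss with respect to those angles by the chain rule, and recovers a pose from a target image by gradient descent.

```python
import numpy as np

LINKS = (1.0, 1.0)  # link lengths of the toy arm (assumed values)
SIGMA = 0.4         # blob standard deviation in world units (assumed value)
AXIS = np.linspace(-2.5, 2.5, 32)
# Pixel centres of a 32x32 image covering [-2.5, 2.5]^2, shape (32, 32, 2).
GRID = np.stack(np.meshgrid(AXIS, AXIS, indexing="xy"), axis=-1)


def forward_kinematics(q):
    """Return keypoints (elbow, tip) with shape (2, 2) and their Jacobian."""
    l1, l2 = LINKS
    a, b = q[0], q[0] + q[1]
    elbow = l1 * np.array([np.cos(a), np.sin(a)])
    tip = elbow + l2 * np.array([np.cos(b), np.sin(b)])
    d1 = l1 * np.array([-np.sin(a), np.cos(a)])  # d(elbow)/dq1
    d2 = l2 * np.array([-np.sin(b), np.cos(b)])  # d(second link)/dq1 and /dq2
    jac = np.zeros((2, 2, 2))  # jac[k, :, j] = d keypoint_k / d q_j
    jac[0, :, 0] = d1
    jac[1, :, 0] = d1 + d2
    jac[1, :, 1] = d2
    return np.stack([elbow, tip]), jac


def splat(q):
    """Return per-keypoint blobs, pixel-to-keypoint offsets, and the Jacobian."""
    points, jac = forward_kinematics(q)
    offsets = GRID[None] - points[:, None, None, :]  # (2, 32, 32, 2)
    blobs = np.exp(-np.sum(offsets**2, axis=-1) / (2 * SIGMA**2))
    return blobs, offsets, jac


def render(q):
    """Render the arm as the sum of its keypoint blobs, shape (32, 32)."""
    return splat(q)[0].sum(axis=0)


def loss_and_grad(q, target):
    """Half mean squared pixel error and its gradient w.r.t. joint angles."""
    blobs, offsets, jac = splat(q)
    residual = blobs.sum(axis=0) - target
    loss = 0.5 * np.mean(residual**2)
    # Pixels -> keypoints: d blob_k / d p_k = blob_k * (x - p_k) / sigma^2.
    dloss_dpoints = (
        np.mean((residual * blobs)[..., None] * offsets, axis=(1, 2)) / SIGMA**2
    )
    # Keypoints -> joint angles through the kinematic Jacobian.
    grad = np.einsum("kc,kcj->j", dloss_dpoints, jac)
    return loss, grad


def fit_pose(target, q0, steps=400, lr=1.5):
    """Plain gradient descent on joint angles; lr suits the mean-scaled loss."""
    q = np.array(q0, dtype=float)
    losses = []
    for _ in range(steps):
        loss, grad = loss_and_grad(q, target)
        losses.append(loss)
        q -= lr * grad
    return q, losses


q_true = np.array([0.8, 0.6])  # pose used to render the target image
target = render(q_true)
q_est, losses = fit_pose(target, q0=[0.6, 0.4])
print("estimated joint angles:", q_est)
print("loss: first step", losses[0], "last step", losses[-1])
```

The same loop structure would apply if the target image came from a camera or from an image-based objective rather than from `render`; only the loss changes. In practice an autodiff framework would replace the hand-derived gradient, which is written out here only to show the pixel-to-keypoint-to-joint chain explicitly.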

Ruoshi Liu, Alper Canberk, Shuran Song, Carl Vondrick • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Visual-goal pose reconstruction | Franka Robot Environment | Success Rate | 92 | 12 |
| Visual-goal pose reconstruction | Fetch Robot Environment | Success Rate (%) | 84 | 12 |
| Visual-goal pose reconstruction | UR5e Robot Environment | Success Rate | 80 | 12 |
| Articulated Object Reconstruction | Robot 2 Arm Airbot Play | mIoU | 57.43 | 4 |
| Pose Reconstruction | Panda-3CAM-Azure | Joint 1 Error (J1 Error) | 0.077 | 4 |
| Visual-goal motion planning | Fetch 2.5 rad bin 1.0 (test) | Success Rate (SR) | 1 | 4 |
| Articulated Object Reconstruction | Furniture 21 IKEA Cabinet | IoU | 57.36 | 4 |
| Articulated Object Reconstruction | Robot 1 Hand Xhand | IoU | 28.53 | 4 |
| Articulated Object Reconstruction | Furniture IKEA Cabinet 09 | IoU | 35.84 | 4 |
| Visual-goal motion planning | Franka 0.5 rad bin 1.0 (test) | Success Rate (SR) | 59.8 | 4 |
Showing 10 of 26 rows
