
SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework

About

Building humanoid robots capable of generalizable whole-body loco-manipulation in the real world remains a fundamental challenge. Existing methods either rely on laborious task-specific reward engineering, rigidly replay reference motions that fail to generalize, or depend on costly teleoperation that limits scalability. While human videos capture diverse human behaviors, motion priors inferred from them are inherently imperfect, suffering from occlusion, contact artifacts, and retargeting errors that render them unsuitable for direct policy learning. To address this, we present SUGAR, a scalable data-driven framework that converts diverse human videos into deployable humanoid loco-manipulation skills, without any task-specific reward engineering or reference-motion conditioning at inference. SUGAR proceeds in three stages. First, a fully automated pipeline extracts kinematic interaction priors including human-object motion trajectories and contact labels from unstructured human videos. Second, a privileged physics-based refiner uses a unified mimic reward and progressive state pool to transform imperfect priors into physically feasible, high-fidelity skills. Third, refined skills are distilled into a hierarchical autonomous policy consisting of a command generator and a command tracker. We evaluate SUGAR on six representative loco-manipulation tasks in simulation and on real-world humanoid hardware. Our method substantially outperforms reference-tracking baselines, and performance scales clearly with the amount of human video data. It also achieves zero-shot real-world transfer with reliable closed-loop execution, autonomous failure recovery, and stable long-horizon performance under external perturbations. Project Page: https://tianshuwu.github.io/sugar-humanoid/

Tianshu Wu, Xiangqi Kong, Yue Chen, Qize Yu, Hang Ye, Jia Li, Yizhou Wang, Hao Dong • 2026

Related benchmarks

| Task         | Dataset                              | Metric            | Result | Rank |
|--------------|--------------------------------------|-------------------|--------|------|
| Carry Box    | Loco-manipulation Simulation (train) | Success Rate (SR) | 84.5   | 8    |
| Carry Box    | Loco-manipulation Simulation (test)  | Success Rate (SR) | 69.6   | 8    |
| Kick Box     | Loco-manipulation Simulation (train) | Success Rate (SR) | 89.5   | 8    |
| Kick Box     | Loco-manipulation Simulation (test)  | Success Rate (SR) | 76.3   | 8    |
| Pick Bottle  | Loco-manipulation Simulation (train) | Success Rate (SR) | 98.8   | 8    |
| Pick Bottle  | Loco-manipulation Simulation (test)  | Success Rate (SR) | 99.2   | 8    |
| Push Box     | Loco-manipulation Simulation (test)  | Success Rate (SR) | 73     | 8    |
| Sit Chair    | Loco-manipulation Simulation (train) | Success Rate (SR) | 96.7   | 8    |
| Sit Chair    | Loco-manipulation Simulation (test)  | Success Rate (SR) | 99.6   | 8    |
| Stand Bottle | Loco-manipulation Simulation (train) | Success Rate (SR) | 91.9   | 8    |
Showing 10 of 18 rows
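
One way to read the rows above is to compare each task's train and test success rates. The sketch below is not from the paper: the names `results` and `split_gaps` are illustrative, and the numbers are copied only from the 10 rows visible here, so tasks with a single visible split (Push Box, Stand Bottle) are skipped rather than guessed.

```python
# Success rates copied from the visible benchmark rows (10 of 18).
# `results` and `split_gaps` are hypothetical names used for illustration.
results = {
    ("Carry Box", "train"): 84.5,
    ("Carry Box", "test"): 69.6,
    ("Kick Box", "train"): 89.5,
    ("Kick Box", "test"): 76.3,
    ("Pick Bottle", "train"): 98.8,
    ("Pick Bottle", "test"): 99.2,
    ("Push Box", "test"): 73.0,
    ("Sit Chair", "train"): 96.7,
    ("Sit Chair", "test"): 99.6,
    ("Stand Bottle", "train"): 91.9,
}


def split_gaps(results):
    """Return train-minus-test success rate for tasks that list both splits."""
    gaps = {}
    for task in sorted({task for task, _ in results}):
        train = results.get((task, "train"))
        test = results.get((task, "test"))
        if train is not None and test is not None:
            # Round to one decimal to match the precision of the table.
            gaps[task] = round(train - test, 1)
    return gaps


gaps = split_gaps(results)
for task, gap in gaps.items():
    print(f"{task}: {gap:+.1f}")
```

A positive gap means the train result is higher than the test result for that task. Because eight rows are not shown, these gaps describe only the visible subset.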
