
DeCo: Task Decomposition and Skill Composition for Zero-Shot Generalization in Long-Horizon 3D Manipulation

About

Generalizing language-conditioned multi-task imitation learning (IL) models to novel long-horizon 3D manipulation tasks is challenging. To address this, we propose DeCo (Task Decomposition and Skill Composition), a model-agnostic framework that enhances zero-shot generalization to compositional long-horizon manipulation tasks. DeCo decomposes IL demonstrations into modular atomic tasks based on gripper-object interactions, creating a dataset that enables models to learn reusable skills. At inference, DeCo uses a vision-language model (VLM) to parse high-level instructions, retrieve relevant skills, and dynamically schedule their execution. A spatially-aware skill-chaining module ensures smooth, collision-free transitions between skills. We introduce DeCoBench, a benchmark designed to evaluate compositional generalization in long-horizon manipulation tasks. DeCo improves the success rate of three IL models, RVT-2, 3DDA, and ARP, by 66.67%, 21.53%, and 57.92%, respectively, on 12 novel tasks. In real-world experiments, the DeCo-enhanced model, trained on only 6 atomic tasks, completes 9 novel tasks zero-shot, a 53.33% improvement over the baseline model. Project website: https://deco226.github.io.
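
The abstract says demonstrations are decomposed into atomic tasks based on gripper-object interactions but gives no procedure. As a rough illustration only, and not the authors' implementation, the sketch below assumes each timestep of a demonstration carries a boolean gripper-closed flag and cuts the trajectory wherever that flag changes. The names `Segment` and `split_on_gripper_change` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: int            # first timestep of the segment (inclusive)
    end: int              # last timestep of the segment (inclusive)
    gripper_closed: bool  # gripper state held throughout the segment


def split_on_gripper_change(gripper_closed: List[bool]) -> List[Segment]:
    """Split a demonstration wherever the gripper state flips.

    Assumption: a change in the open/closed flag is used here as a
    stand-in for a gripper-object interaction boundary.
    """
    segments: List[Segment] = []
    if not gripper_closed:
        return segments
    start = 0
    for t in range(1, len(gripper_closed)):
        if gripper_closed[t] != gripper_closed[t - 1]:
            # Close the segment that ended at the previous timestep.
            segments.append(Segment(start, t - 1, gripper_closed[t - 1]))
            start = t
    # The final segment runs to the end of the demonstration.
    segments.append(Segment(start, len(gripper_closed) - 1, gripper_closed[-1]))
    return segments


# Example: approach (open), carry (closed), release (open).
demo = [False, False, True, True, True, False]
for seg in split_on_gripper_change(demo):
    print(seg)
```

A real pipeline would likely also need object contact information and language labels for each segment; this sketch covers only the boundary detection step.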

Zixuan Chen, Junhui Yin, Yangtao Chen, Jing Huo, Pinzhuo Tian, Jieqi Shi, Yiwen Hou, Yinchuan Li, Yang Gao • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Long-horizon Robotic Task Execution | DeCoBench Novel long-horizon tasks | Average Success Rate: 6.67e+3 | 6 |
| Robotic Planning | Tidy_House | Success Rate: 46.67 | 6 |
| Robotic Planning | Prepare_Groceries | Success Rate: 35 | 6 |
| Robotic Planning | Set Table | Success Rate: 23.67 | 6 |
| 3D Manipulation | Real-world Novel Long-Horizon Tasks | Avg. Success Rate: 53.33 | 2 |
| 3D Manipulation | Real-world Atomic Tasks | Average Success Rate: 88.33 | 2 |
