Demo2Code: From Summarizing Demonstrations to Synthesizing Code via Extended Chain-of-Thought
About
Language instructions and demonstrations are two natural ways for users to teach robots personalized tasks. Recent progress in Large Language Models (LLMs) has shown impressive performance in translating language instructions into code for robotic tasks. However, translating demonstrations into task code continues to be a challenge due to the length and complexity of both demonstrations and code, making learning a direct mapping intractable. This paper presents Demo2Code, a novel framework that generates robot task code from demonstrations via an extended chain-of-thought and defines a common latent specification to connect the two. Our framework employs a robust two-stage process: (1) a recursive summarization technique that condenses demonstrations into concise specifications, and (2) a code synthesis approach that expands each function recursively from the generated specifications. We conduct extensive evaluation on various robot task benchmarks, including a novel game benchmark Robotouille, designed to simulate diverse cooking tasks in a kitchen environment. The project's website is available at https://portal-cornell.github.io/demo2code/
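The two-stage process described above can be illustrated with a minimal sketch. This is not the authors' implementation: the LLM calls are replaced by hypothetical stand-ins (`merge_pairs` and a `CANNED` lookup table), and every function name, the `max_len` threshold, and the burger example are assumptions made only for illustration. The sketch shows just the control flow: summarize repeatedly until the specification is short, then keep defining any function the code calls but has not yet defined.

```python
import re
from typing import Callable, List, Set

# Naive patterns for spotting calls and definitions in generated code (illustrative only).
CALL = re.compile(r"\b([a-z_]\w*)\s*\(")
DEF = re.compile(r"^\s*def\s+(\w+)", re.MULTILINE)


def summarize_recursively(steps: List[str],
                          summarize: Callable[[List[str]], List[str]],
                          max_len: int = 2) -> List[str]:
    """Stage 1: condense a demonstration until it has at most max_len lines."""
    current = list(steps)
    while len(current) > max_len:
        shorter = summarize(current)
        if len(shorter) >= len(current):  # no progress, stop to avoid looping forever
            break
        current = shorter
    return current


def expand_recursively(code: str,
                       define: Callable[[str], str],
                       primitives: Set[str]) -> str:
    """Stage 2: define every called-but-undefined function until none remain."""
    program = code
    while True:
        defined = set(DEF.findall(program))
        missing = [name for name in CALL.findall(program)
                   if name not in defined and name not in primitives]
        if not missing:
            return program
        new_code = define(missing[0])
        if missing[0] not in DEF.findall(new_code):
            raise ValueError(f"define() did not produce a definition for {missing[0]}")
        program = new_code + "\n\n" + program


# Hypothetical stand-in for an LLM summarizer: merge adjacent lines pairwise.
def merge_pairs(lines: List[str]) -> List[str]:
    return [" then ".join(lines[i:i + 2]) for i in range(0, len(lines), 2)]


# Hypothetical stand-in for an LLM code generator: canned function bodies.
CANNED = {
    "make_burger": "def make_burger():\n    cook_patty()\n    stack('patty', 'bun')",
    "cook_patty": "def cook_patty():\n    move_to('stove')\n    cook('patty')",
}
PRIMITIVES = {"move_to", "cook", "stack"}  # assumed robot API, not from the paper

demo = ["move to stove", "place patty", "wait", "pick up patty", "stack on bun"]
spec = summarize_recursively(demo, merge_pairs, max_len=2)
program = expand_recursively("make_burger()", CANNED.__getitem__, PRIMITIVES)
```

In a real system both stand-ins would be prompts to a language model, and the specification produced by the first stage would be part of the prompt for the second; here the two stages are run independently to keep the sketch self-contained.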
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Robotic manipulation task code generation | Tabletop Manipulation simulator | Execution Success Rate | 100 | 30 |
| Robotic Manipulation | RLBench standard (test) | Reach Target Success Rate | 94 | 12 |
| Long-horizon manipulation | Real-world Long-Horizon Tasks (seen environment) | Task Success Rate: Make coffee | 0 | 6 |
| Code Generation from Video | EPIC-Kitchens 100 (P22-05) | Pass Rate | 100 | 3 |
| Code Generation from Video | EPIC-Kitchens 100 (P22-07) | Pass Rate | 100 | 3 |
| Robot task code generation | Robotouille simulator (overall) | Execution Success Rate | 79 | 3 |
| Code Generation from Video | EPIC-Kitchens 100 (P4-101) | Pass Rate | 100 | 3 |
| Code Generation from Video | EPIC-Kitchens 100 (P7-10) | Pass Rate | 1 | 3 |
| Code Generation from Video | EPIC-Kitchens 100 (P7-04) | Pass Rate | 0 | 3 |
| Code Generation from Video | EPIC-Kitchens 100 (P30-07) | Pass Rate | 1 | 3 |