Masked Generative Policy for Robotic Control
About
We present Masked Generative Policy (MGP), a novel framework for visuomotor imitation learning. We represent actions as discrete tokens, and train a conditional masked transformer that generates tokens in parallel and then rapidly refines only low-confidence tokens. We further propose two new sampling paradigms: MGP-Short, which performs parallel masked generation with score-based refinement for Markovian tasks, and MGP-Long, which predicts full trajectories in a single pass and dynamically refines low-confidence action tokens based on new observations. With globally coherent prediction and robust adaptive execution capabilities, MGP-Long enables reliable control on complex and non-Markovian tasks that prior methods struggle with. Extensive evaluations on 150 robotic manipulation tasks spanning the Meta-World and LIBERO benchmarks show that MGP achieves both rapid inference and superior success rates compared to state-of-the-art diffusion and autoregressive policies. Specifically, MGP increases the average success rate by 9% across 150 tasks while cutting per-sequence inference time by up to 35x. It further improves the average success rate by 60% in dynamic and missing-observation environments, and solves two non-Markovian scenarios where other state-of-the-art methods fail.
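The parallel generate-then-refine loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_logits` is a hypothetical stand-in for the conditional masked transformer, and the `MASK` sentinel, the `remask_frac` schedule, and all parameter values are assumptions chosen only to show the control flow (predict all masked action tokens in parallel, keep confident ones, re-mask and re-predict the least confident ones).

```python
import numpy as np

MASK = -1  # hypothetical sentinel id for a masked action token


def toy_logits(tokens, obs_bias, rng):
    """Hypothetical stand-in for the conditional masked transformer.

    Returns random logits of shape (seq_len, vocab_size), shifted by an
    observation-dependent bias. A real model would attend to the observation
    and to the currently unmasked tokens.
    """
    return rng.normal(size=(len(tokens), len(obs_bias))) + obs_bias


def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)


def masked_decode(obs_bias, seq_len=8, steps=3, remask_frac=0.5, seed=0):
    """Parallel masked generation with confidence-based refinement (sketch)."""
    rng = np.random.default_rng(seed)
    tokens = np.full(seq_len, MASK)   # start fully masked
    conf = np.zeros(seq_len)          # confidence of each accepted token

    for step in range(steps):
        probs = softmax(toy_logits(tokens, obs_bias, rng))
        pred = probs.argmax(axis=-1)      # most likely token per position
        new_conf = probs.max(axis=-1)     # its probability, used as confidence

        # Fill every masked position in parallel; keep accepted tokens as-is.
        masked = tokens == MASK
        tokens = np.where(masked, pred, tokens)
        conf = np.where(masked, new_conf, conf)

        if step == steps - 1:
            break  # final pass: nothing is re-masked

        # Re-mask the k least confident tokens so the next pass refines them.
        k = int(remask_frac * masked.sum())
        if k > 0:
            tokens[np.argsort(conf)[:k]] = MASK

    return tokens


# Usage: decode an 8-token action sequence over a 16-entry codebook.
action_tokens = masked_decode(obs_bias=np.zeros(16))
```

With these assumed defaults, the first pass fills all 8 positions, the second re-predicts the 4 least confident, and the third re-predicts 2, so the number of refined tokens shrinks at every pass instead of decoding one token at a time.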
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Robot Manipulation | MetaWorld Very Hard 5 tasks | Success Rate | 58.6 | 15 |
| Multi-task imitation learning | LIBERO-90 | Success Rate | 88.9 | 7 |
| Multi-task imitation learning | LIBERO Long | Success Rate | 82 | 7 |
| Robot Manipulation | Meta-World Hard 5 | Assembly Success Rate | 100 | 6 |
| Robotic Manipulation | Meta-World Observation Missing | Success Rate (Hard, 5 Trials) | 48.4 | 6 |
| Robotic Manipulation | Meta-World Dynamic Environments | Basketball Success Rate | 100 | 6 |
| Robotic Manipulation | Meta-World Single-Task (train) | Success Rate (Easy) | 92 | 6 |
| Robot Manipulation | Meta-World Long-horizon tasks (test) | Success Rate (Hard) | 54 | 4 |
| Sequential Button Pressing | Non-Markovian Tabletop (Button Press On/Off) | Success Rate | 1 | 4 |
| Sequential Button Pressing with State Cycling | Non-Markovian Tabletop Button Press Color Change | Success Rate | 100 | 4 |