AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies
About
Diffusion-based imitation learning improves on Behavioral Cloning (BC) for multi-modal decision-making, but at the cost of significantly slower inference due to the recursion in the diffusion process. This motivates the design of efficient policy generators that retain the ability to generate diverse actions. To address this challenge, we propose AdaFlow, an imitation learning framework based on flow-based generative modeling. AdaFlow represents the policy with state-conditioned ordinary differential equations (ODEs), which are known as probability flows. We reveal an intriguing connection between the conditional variance of their training loss and the discretization error of the ODEs. With this insight, we propose a variance-adaptive ODE solver that can adjust its step size at inference time, making AdaFlow an adaptive decision-maker that offers rapid inference without sacrificing diversity. Interestingly, it automatically reduces to a one-step generator when the action distribution is uni-modal. Our comprehensive empirical evaluation shows that AdaFlow achieves high performance with fast inference speed.
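The variance-adaptive solver described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `velocity_fn`, `sigma_fn`, `eta`, and `eps_min` are hypothetical, and the step-size rule (step inversely proportional to a predicted standard deviation, clipped to a minimum step and to the remaining time) is an assumption chosen to match the behaviour the abstract describes. When the predicted variance is zero, the step is clipped to the whole remaining interval, so the sampler takes a single Euler step.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def adaptive_euler_sample(
    velocity_fn: Callable[[Vector, float, object], Sequence[float]],
    sigma_fn: Callable[[Vector, float, object], float],
    state: object,
    x0: Sequence[float],
    eta: float = 0.1,       # hypothetical error tolerance
    eps_min: float = 0.05,  # hypothetical smallest allowed step
) -> Tuple[Vector, int]:
    """Integrate dx/dt = velocity_fn(x, t, state) from t=0 to t=1.

    The Euler step size shrinks where sigma_fn predicts high variance
    (assumed rule: eps = clip(eta / sigma, eps_min, 1 - t)).
    Returns the final action and the number of steps taken.
    """
    x = [float(v) for v in x0]
    t, n_steps = 0.0, 0
    while t < 1.0 - 1e-9:
        # Guard against division by zero when the variance vanishes.
        sigma = max(float(sigma_fn(x, t, state)), 1e-12)
        # Larger variance -> smaller step; never overshoot t = 1.
        eps = min(max(eta / sigma, eps_min), 1.0 - t)
        v = velocity_fn(x, t, state)
        x = [xi + eps * vi for xi, vi in zip(x, v)]
        t += eps
        n_steps += 1
    return x, n_steps


if __name__ == "__main__":
    # Toy stand-ins for learned networks: a constant velocity field and
    # two constant variance predictions (uni-modal vs. multi-modal state).
    velocity = lambda x, t, s: [1.0, -2.0]
    for sigma in (0.0, 1.0):
        action, steps = adaptive_euler_sample(
            velocity, lambda x, t, s: sigma, None, [0.0, 0.0], eta=0.25
        )
        print(f"sigma={sigma}: steps={steps}, action={action}")
```

In a real policy, both callables would be neural networks conditioned on the observation; the toy lambdas here only show how the step count responds to the predicted variance.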
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Robot Manipulation | MetaWorld 50 tasks | Success Rate (Easy) | 50.4 | 21 |
| Robot Manipulation | MetaWorld Hard (6 tasks) | Success Rate | 12.6 | 18 |
| Robot Manipulation | MetaWorld Medium 11 tasks | Success Rate | 19.1 | 18 |
| Robot Manipulation | Meta-World | Latency (Easy) (ms) | 49.4 | 15 |
| Robotic Arm Manipulation | MetaWorld Very Hard | Success Rate | 32.3 | 15 |
| Robot Manipulation | MetaWorld Very Hard 5 tasks | Success Rate | 32.3 | 15 |
| Robotic Arm Manipulation | MetaWorld Easy | Success Rate | 50.6 | 15 |
| Lift | RoboMimic | Success Rate | 100 | 11 |
| Robot Manipulation | Adroit | Hammer Task Score | 45 | 11 |
| Robot Manipulation | Adroit 3 tasks | Hammer Success Rate | 45 | 10 |