Reinforcement Learning with Action Chunking

About

We present Q-chunking, a simple yet effective recipe for improving reinforcement learning (RL) algorithms for long-horizon, sparse-reward tasks. Our recipe is designed for the offline-to-online RL setting, where the goal is to leverage an offline prior dataset to maximize the sample-efficiency of online learning. Effective exploration and sample-efficient learning remain central challenges in this setting, as it is not obvious how the offline data should be utilized to acquire a good exploratory policy. Our key insight is that action chunking, a technique popularized in imitation learning where sequences of future actions are predicted rather than a single action at each timestep, can be applied to temporal difference (TD)-based RL methods to mitigate the exploration challenge. Q-chunking adopts action chunking by directly running RL in a 'chunked' action space, enabling the agent to (1) leverage temporally consistent behaviors from offline data for more effective online exploration and (2) use unbiased $n$-step backups for more stable and efficient TD learning. Our experimental results demonstrate that Q-chunking exhibits strong offline performance and online sample efficiency, outperforming prior best offline-to-online methods on a range of long-horizon, sparse-reward manipulation tasks.
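The chunked $n$-step backup mentioned in the abstract can be illustrated with a small sketch. It assumes a critic that scores a state together with a whole chunk of h actions, and a target that sums the h discounted rewards collected while the chunk runs, then bootstraps from the critic at the state reached afterwards. The names `chunked_td_target` and `q_fn`, and the `done` flag, are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from typing import Any, Callable, Sequence


def chunked_td_target(
    rewards: Sequence[float],
    next_state: Any,
    next_chunk: Any,
    q_fn: Callable[[Any, Any], float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Compute an h-step TD target for a critic defined over action chunks.

    rewards    -- the h rewards observed while executing one action chunk
    next_state -- the state reached after the chunk finishes
    next_chunk -- the action chunk selected at next_state
    q_fn       -- hypothetical critic: (state, chunk) -> scalar value
    done       -- assumed flag: True if the episode ended during the chunk
    """
    h = len(rewards)
    # Discounted sum of the rewards produced by the chunk itself.
    discounted_return = sum(gamma**i * r for i, r in enumerate(rewards))
    # Bootstrap h steps ahead, unless the episode has terminated.
    bootstrap = 0.0 if done else gamma**h * q_fn(next_state, next_chunk)
    return discounted_return + bootstrap


# Toy usage: a constant critic and a sparse reward on the last step of the chunk.
constant_q = lambda state, chunk: 1.0
target = chunked_td_target([0.0, 0.0, 0.0, 1.0], None, None, constant_q, gamma=0.5)
print(target)
```

Under these assumptions, the intermediate rewards are produced by exactly the actions the critic is conditioned on, which is one way to read the abstract's claim that the backup is unbiased: no off-policy correction is applied to the h-step return.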

Qiyang Li, Zhiyuan Zhou, Sergey Levine • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Robotic Manipulation | Robomimic Can | Success Rate: 94 | 30 |
| Robotic Manipulation | Robomimic Lift | Success Rate: 100 | 28 |
| Robotic Manipulation | Robomimic Square | Success Rate: 92 | 26 |
| Robot Manipulation | RoboCasa-GR1 | Average Success Rate: 55.5 | 20 |
| Navigation | OGBench humanoidmaze-medium-navigate | Success Rate (Offline): 59 | 15 |
| Robotic Manipulation | OGBench Cube-double-task2 | Success Rate: 100 | 15 |
| Robotic Manipulation | OGBench cube-double online | Success Rate: 100 | 14 |
| Robotic Manipulation | OGBench cube-triple online | Success Rate: 64 | 14 |
| Robotic Manipulation | OGBench cube-quadruple online | Success Rate: 77 | 14 |
| Robotic Manipulation | OGBench overall 25 tasks online | Success Rate: 86 | 14 |
Showing 10 of 98 rows
...
