Adaptive Action Chunking via Multi-Chunk Q Value Estimation

About

Action chunking has emerged as a pivotal technique in imitation learning, enabling policies to predict cohesive action sequences rather than single actions. Recently, this approach has been extended to reinforcement learning (RL), where it improves behavioral consistency and reduces bootstrapping errors in value function estimation. However, existing methods rely on a fixed chunk length, which creates a performance bottleneck because the optimal length varies across states and tasks. In this paper, we propose Adaptive Action CHunking (ACH), a novel offline-to-online RL algorithm that dynamically modulates chunk length during both training and inference. To find the optimal chunk length for the current state, we simultaneously estimate action-values for all candidate chunk lengths in a single forward pass using a Transformer-based architecture. This mechanism allows the agent to adaptively select the most effective chunk length based on the current state. Evaluated on 34 challenging tasks, ACH consistently outperforms fixed-length baselines, demonstrating superior generalization and learning efficiency in complex environments.
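
The selection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the value estimates for all candidate chunk lengths have already been produced (in the paper, by a Transformer in one forward pass), that they are directly comparable, and that selection is a greedy argmax. The function name `select_chunk` and the toy numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def select_chunk(
    actions: Sequence[Sequence[float]],
    q_values: Sequence[float],
) -> Tuple[int, List[Sequence[float]]]:
    """Pick the chunk length with the highest estimated value.

    actions:  proposed action sequence of maximum length H.
    q_values: q_values[h - 1] is the (assumed) value estimate for
              executing the first h actions, for h = 1..H.
    """
    if len(actions) != len(q_values) or len(q_values) == 0:
        raise ValueError("need one value estimate per candidate chunk length")
    # Greedy choice over candidate lengths; ties go to the shortest length
    # because max() returns the first maximal item it encounters.
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    best_len = best_index + 1
    # Only the first best_len actions are executed before re-planning.
    return best_len, list(actions[:best_len])


# Toy example: four candidate lengths, 2-dimensional actions (made-up values).
actions = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.2], [0.4, 0.3]]
q_values = [0.2, 0.9, 0.5, 0.1]
best_len, chunk = select_chunk(actions, q_values)
print(best_len, chunk)
```

In this toy case the second candidate has the highest estimate, so the agent would execute two actions and then re-evaluate from the resulting state. A fixed-length method corresponds to always returning the same `best_len` regardless of `q_values`.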

Yongjae Shin, Jongseong Chae, Seongmin Kim, Jongeui Park, Youngchul Sung • 2026

Related benchmarks

Task	Dataset	Result
Navigation	OGBench humanoidmaze-medium-navigate	Success Rate (Offline): 45	15
Robotic Manipulation	OGBench puzzle-4x4-play	Success Rate (Offline): 13	12
Navigation	OGBench antmaze-giant-navigate (aggregated across five tasks)	Offline Performance: 1	6
Manipulation	OGBench cube-triple-play (aggregated across five tasks)	Offline Performance: 1	6
Manipulation	OGBench cube-quadruple-play-10M	Offline Performance: 0	6
Navigation	OGBench antsoccer-arena-navigate	Offline Performance: 0	6
