Adaptive Action Chunking via Multi-Chunk Q Value Estimation

About

Action chunking has emerged as a pivotal technique in imitation learning, enabling policies to predict cohesive action sequences rather than single actions. Recently, this approach has been extended to reinforcement learning (RL), where it improves behavioral consistency and reduces bootstrapping errors in value function estimation. However, existing methods rely on a fixed chunk length, which creates a performance bottleneck because the optimal length varies across states and tasks. In this paper, we propose Adaptive Action CHunking (ACH), a novel offline-to-online RL algorithm that dynamically adjusts the chunk length during both training and inference. To find the optimal chunk length for the current state, we estimate action-values for all candidate chunk lengths simultaneously in a single forward pass of a Transformer-based architecture. This mechanism allows the agent to adaptively select the most effective chunk length for each state. Evaluated on 34 challenging tasks, ACH consistently outperforms fixed-length baselines, demonstrating superior generalization and learning efficiency in complex environments.
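
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: every name (`CANDIDATE_LENGTHS`, `select_chunk_length`, `act`, and the toy policy and critic) is hypothetical, the Transformer critic is replaced by a toy function that returns one value per candidate length in a single call, and a plain argmax over those values is assumed as the selection rule.

```python
from typing import Callable, List, Sequence

# Hypothetical candidate chunk lengths; the paper's actual set is not given here.
CANDIDATE_LENGTHS = (1, 2, 4)


def select_chunk_length(q_values: Sequence[float],
                        candidate_lengths: Sequence[int]) -> int:
    """Return the candidate length with the highest Q-value (assumed argmax rule)."""
    if len(q_values) != len(candidate_lengths):
        raise ValueError("expected one Q-value per candidate chunk length")
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    return candidate_lengths[best_index]


def act(state: float,
        propose_actions: Callable[[float], List[float]],
        multi_chunk_q: Callable[[float, List[float]], List[float]],
        candidate_lengths: Sequence[int] = CANDIDATE_LENGTHS) -> List[float]:
    """Propose a full-length action sequence, score every candidate length
    with one critic call, and return only the selected prefix to execute."""
    actions = propose_actions(state)          # length == max(candidate_lengths)
    q_values = multi_chunk_q(state, actions)  # one value per candidate length
    chunk_length = select_chunk_length(q_values, candidate_lengths)
    return actions[:chunk_length]


# Toy stand-ins for the policy and the multi-chunk critic (illustration only).
def toy_propose_actions(state: float) -> List[float]:
    return [state + 0.1 * t for t in range(max(CANDIDATE_LENGTHS))]


def toy_multi_chunk_q(state: float, actions: List[float]) -> List[float]:
    # Scores each length h by how close h is to the scalar state.
    return [-abs(state - h) for h in CANDIDATE_LENGTHS]


chunk = act(2.0, toy_propose_actions, toy_multi_chunk_q)
print(len(chunk))  # 2
```

In this toy setup the chosen length changes with the state, which is the behavior a fixed-length baseline cannot express. How Q-values for different chunk lengths are made comparable during training is not covered by this sketch.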

Yongjae Shin, Jongseong Chae, Seongmin Kim, Jongeui Park, Youngchul Sung • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Navigation | OGBench humanoidmaze-medium-navigate | Success Rate (Offline): 45 | 15 |
| Robotic Manipulation | OGBench puzzle-4x4-play | Success Rate (Offline): 13 | 12 |
| Navigation | OGBench antmaze-giant-navigate (Aggregated across five tasks) | Offline Performance: 1 | 6 |
| Manipulation | OGBench cube-triple-play (Aggregated across five tasks) | Offline Performance: 1 | 6 |
| Manipulation | OGBench cube-quadruple-play-10M | Offline Performance: 0.00e+0 | 6 |
| Navigation | OGBench antsoccer-arena-navigate | Offline Performance: 0.00e+0 | 6 |
