
Ada-Diffuser: Latent-Aware Adaptive Diffusion for Decision-Making

About

Recent work has framed decision-making as a sequence modeling problem using generative models such as diffusion models. Although promising, these approaches often overlook latent factors that exhibit evolving dynamics, elements that are fundamental to environment transitions, reward structures, and high-level agent behavior. Explicitly modeling these hidden processes is essential for both precise dynamics modeling and effective decision-making. In this paper, we propose a unified framework that explicitly incorporates latent dynamic inference into generative decision-making from minimal yet sufficient observations. We theoretically show that under mild conditions, the latent process can be identified from small temporal blocks of observations. Building on this insight, we introduce Ada-Diffuser, a causal diffusion model that learns the temporal structure of observed interactions and the underlying latent dynamics simultaneously, and furthermore, leverages them for planning and control. With a modular design, Ada-Diffuser supports both planning and policy learning tasks, enabling adaptation to latent variations in dynamics, rewards, and latent actions. Experiments on simulated control and robotic benchmarks demonstrate its effectiveness in accurate latent inference and adaptive policy learning.

Fan Feng, Selena Ge, Minghao Fu, Zijian Li, Yujia Zheng, Zeyu Tang, Yingyao Hu, Biwei Huang, Kun Zhang • 2026

Related benchmarks

Task                    Dataset                                   Metric                Result  Rank
Robotic Manipulation    LIBERO                                    Spatial Success Rate  79.2    527
Robotic Manipulation    Robomimic Can                             Success Rate          98      30
Robotic Manipulation    Robomimic Lift                            Success Rate          98      28
Robotic Manipulation    Robomimic Square                          Success Rate          89      26
Locomotion              Cheetah-Wind-E (c^s)                      Average Return        62.4    14
Navigation              D4RL Maze2d-umaze                         Normalized Return     148.6   14
Locomotion              Cheetah-Wind-S c^s                        Average Return        -65.3   14
Locomotion              Cheetah-Vel-E (c^r)                       Average Return        -39.2   14
Locomotion              Ant-Dir-E c^r                             Average Return        296.4   14
Reinforcement Learning  Cheetah-Wind-E dynamics changes episodic  Average Return        -41.6   8
Showing 10 of 22 rows
