
Diffusion Modulation via Environment Mechanism Modeling for Planning

About

Diffusion models have shown promising capabilities for trajectory generation in planning for offline reinforcement learning (RL). However, conventional diffusion-based planning methods often fail to account for the fact that trajectories in RL require consistency between successive transitions to remain coherent with the real environment. This oversight can produce considerable discrepancies between the generated trajectories and the underlying mechanisms of the real environment. To address this problem, we propose a novel diffusion-based planning method, termed Diffusion Modulation via Environment Mechanism Modeling (DMEMM). DMEMM modulates diffusion model training by incorporating key RL environment mechanisms, in particular transition dynamics and reward functions. Experimental results demonstrate that DMEMM achieves state-of-the-art performance for planning with offline RL.
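The abstract does not spell out the training objective, but the core idea — modulating the diffusion loss with environment-mechanism terms — can be sketched as a standard denoising loss augmented with penalties for violating learned transition dynamics and reward functions. Everything below is a hypothetical toy illustration: the model forms, function names, and trade-off weights `lam_dyn` / `lam_rew` are assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, action_dim, horizon = 3, 2, 6

# Hypothetical stand-in models; DMEMM would learn these from offline data.
W = rng.normal(size=(state_dim, action_dim))

def transition_model(s, a):
    """Predicted next state under a toy learned dynamics model, s' ~ f(s, a)."""
    return s + 0.05 * (W @ a)

def reward_model(s, a):
    """Predicted reward under a toy learned reward model, r ~ R(s, a)."""
    return float(-np.sum(s ** 2) + np.sum(a))

def mechanism_penalty(states, actions, rewards):
    """Penalize trajectories that violate the environment mechanisms:
    dynamics residual ||s_{t+1} - f(s_t, a_t)||^2 and
    reward residual   (r_t - R(s_t, a_t))^2, averaged over the horizon."""
    T = len(actions)
    dyn = sum(np.sum((states[t + 1] - transition_model(states[t], actions[t])) ** 2)
              for t in range(T)) / T
    rew = sum((rewards[t] - reward_model(states[t], actions[t])) ** 2
              for t in range(T)) / T
    return dyn, rew

def modulated_loss(denoise_err, states, actions, rewards,
                   lam_dyn=1.0, lam_rew=0.5):
    """Diffusion denoising loss plus weighted mechanism penalties
    (lam_dyn and lam_rew are hypothetical trade-off weights)."""
    dyn, rew = mechanism_penalty(states, actions, rewards)
    return denoise_err + lam_dyn * dyn + lam_rew * rew

# A trajectory that exactly follows the stand-in mechanisms...
states = [rng.normal(size=state_dim)]
actions = [rng.normal(size=action_dim) for _ in range(horizon - 1)]
for t in range(horizon - 1):
    states.append(transition_model(states[t], actions[t]))
rewards = [reward_model(states[t], actions[t]) for t in range(horizon - 1)]
consistent = modulated_loss(0.2, states, actions, rewards)

# ...versus one whose states are perturbed away from the dynamics.
broken_states = [s + rng.normal(scale=0.5, size=state_dim) for s in states]
inconsistent = modulated_loss(0.2, broken_states, actions, rewards)
```

The point of the toy example: a trajectory consistent with the (stand-in) dynamics and reward models incurs zero penalty, so its loss reduces to the plain denoising error, while a mechanism-violating trajectory is penalized, steering the generator toward environment-coherent samples.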

Hanping Zhang, Yuhong Guo • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| hopper locomotion | D4RL hopper medium-replay | Normalized Score | 100.6 | 56 |
| walker2d locomotion | D4RL walker2d medium-replay | Normalized Score | 85.8 | 53 |
| Locomotion | D4RL walker2d-medium-expert | Normalized Score | 111.6 | 47 |
| Locomotion | D4RL Halfcheetah medium | Normalized Score | 49.2 | 44 |
| Locomotion | D4RL Walker2d medium | Normalized Score | 86.5 | 44 |
| hopper locomotion | D4RL Hopper medium | Normalized Score | 101.2 | 38 |
| hopper locomotion | D4RL hopper-medium-expert | Normalized Score | 115.9 | 38 |
| Locomotion | D4RL halfcheetah-medium-expert | Normalized Score | 94.6 | 37 |
| Locomotion | D4RL HalfCheetah Medium-Replay | Normalized Score | 0.461 | 33 |
| Navigation | D4RL Maze2d-umaze | Normalized Return | 132.4 | 9 |
(10 of 15 benchmark rows shown.)
