LacaDM: A Latent Causal Diffusion Model for Multiobjective Reinforcement Learning

About

Multiobjective reinforcement learning (MORL) poses significant challenges due to the inherent conflicts between objectives and the difficulty of adapting to dynamic environments. Traditional methods often struggle to generalize effectively, particularly in large and complex state-action spaces. To address these limitations, we introduce the Latent Causal Diffusion Model (LacaDM), a novel approach designed to enhance the adaptability of MORL in discrete and continuous environments. Unlike existing methods that primarily address conflicts between objectives, LacaDM learns latent temporal causal relationships between environmental states and policies, enabling efficient knowledge transfer across diverse MORL scenarios. By embedding these causal structures within a diffusion-model-based framework, LacaDM achieves a balance between conflicting objectives while maintaining strong generalization capabilities in previously unseen environments. Empirical evaluations on various tasks from the MO-Gymnasium framework demonstrate that LacaDM consistently outperforms state-of-the-art baselines in terms of hypervolume, sparsity, and expected utility maximization, showcasing its effectiveness in complex multiobjective tasks.

Xueming Yan, Bo Yin, Yaochu Jin • 2025

Related benchmarks

Task	Dataset	Result
Multi-objective Reinforcement Learning	MO-Gymnasium FruitTree	Sparsity 202	8
Multi-objective Reinforcement Learning	MO-Gymnasium ResourceGathering	Sparsity 633	8
Multi-objective Reinforcement Learning	MO-Gymnasium MOSwimmer	Sparsity 8.77	8
Multi-objective Reinforcement Learning	MO-Gymnasium HighwayEnv	Sparsity 16.5	8
Multi-objective Reinforcement Learning	MO-Gymnasium FourRoom	Sparsity 315	8
Multi-objective Reinforcement Learning	MO-Gymnasium Water Reservoir	Sparsity 1.92	8
Multi-objective Reinforcement Learning	MO-Gymnasium Deep Sea Treasure	Sparsity 12.4	8
Multi-objective Reinforcement Learning	MO-Gymnasium BreakableBottles	Sparsity 32.6	8
Multi-objective Reinforcement Learning	MO-Gymnasium Fishwood	Sparsity 1.43	8
Multi-objective Reinforcement Learning	MO-Gymnasium MOLunarLander	Sparsity 11.8	8

Showing 10 of 16 rows
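
For readers unfamiliar with the Sparsity metric reported above, the following is a minimal sketch of one commonly used definition from the MORL literature: for each objective, sort the values found on the approximated Pareto front, sum the squared gaps between consecutive values, and divide the total by the number of points minus one. Lower values indicate a more evenly and densely covered front. This page does not state which exact definition or normalization produced the numbers in the table, so the function name `sparsity` and the toy front below are illustrative assumptions, not the authors' evaluation code.

```python
def sparsity(front):
    """Sparsity of a set of objective vectors (assumed definition, see text).

    front: iterable of equal-length sequences, one objective vector per policy.
    Returns 0.0 when fewer than two points are given.
    """
    points = [tuple(float(v) for v in p) for p in front]
    n = len(points)
    if n < 2:
        return 0.0
    n_objectives = len(points[0])
    total = 0.0
    for j in range(n_objectives):
        # Sort the j-th objective values, then sum squared consecutive gaps.
        values = sorted(p[j] for p in points)
        total += sum((b - a) ** 2 for a, b in zip(values, values[1:]))
    return total / (n - 1)


# Hypothetical two-objective front with three policies.
toy_front = [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
print(sparsity(toy_front))
```

Because the metric depends on the scale of each objective, values are only comparable between methods evaluated on the same environment, which is consistent with the wide range of magnitudes in the table.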
