Adaptive Human-AI Coordination via Hierarchical Action Disentanglement

About

Human-AI collaboration requires agents that can adapt to diverse partner behaviors and skill levels while remaining robust to unseen partners. Existing methods often collapse to a single dominant behavior or learn poorly aligned skills, limiting effective coordination. We propose Intrinsic Action Disentanglement (IAD), a deep hierarchical reinforcement learning (DHRL) framework that learns distinct, partner-aware low-level action sequences conditioned on high-level latent skills. IAD introduces an intrinsic reward that explicitly encourages disentangled action distributions of the agent's low-level policy across skills, yielding an interpretable mapping between high-level decisions and partner-specific behavioral responses. By capturing temporally extended interaction patterns, IAD enables flexible adaptation to heterogeneous partner dynamics under distributional shift. We evaluate IAD in the Overcooked-AI domain across multiple layouts and diverse partner settings, including unseen simulated partners, a human-proxy model trained on human-human gameplay, and real human partners. Results show that IAD consistently outperforms strong baselines and achieves more reliable, adaptive coordination across all settings.
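To make the idea of an intrinsic reward for disentangled action distributions concrete, here is a minimal sketch. The abstract does not give IAD's actual reward formula, so everything in it is an assumption made for illustration: the choice of Jensen-Shannon divergence as the separation measure, and the hypothetical names `js_divergence` and `disentanglement_bonus`. It only shows one plausible way to score how far the active skill's action distribution sits from those of the other skills in a given state.

```python
import math
from typing import Sequence


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence (natural log) between two discrete distributions."""
    # Mixture distribution; m[i] > 0 whenever p[i] > 0 or q[i] > 0.
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]

    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Terms with a[i] == 0 contribute nothing to the KL sum.
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0.0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def disentanglement_bonus(
    action_probs_by_skill: Sequence[Sequence[float]], active_skill: int
) -> float:
    """Hypothetical intrinsic bonus (an assumption, not the paper's formula).

    Returns the mean JS divergence between the active skill's action
    distribution and the action distribution of every other skill,
    all evaluated in the same state. The value lies in [0, ln 2].
    """
    active = action_probs_by_skill[active_skill]
    others = [p for z, p in enumerate(action_probs_by_skill) if z != active_skill]
    if not others:
        return 0.0
    return sum(js_divergence(active, p) for p in others) / len(others)


# Two skills that prefer different actions earn a larger bonus than
# two skills that behave identically.
distinct = disentanglement_bonus([[0.9, 0.1], [0.1, 0.9]], active_skill=0)
identical = disentanglement_bonus([[0.5, 0.5], [0.5, 0.5]], active_skill=0)
print(distinct > identical)
```

In a hierarchical setup such a bonus would typically be added, with a weighting coefficient, to the low-level policy's task reward; how IAD combines the two terms is not stated in the abstract.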

Adnan Ahmad, Bahareh Nakisa, Mohammad Naim Rastgoo • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Cooperative Multi-Agent Coordination | Overcooked-AI Cramped Room | Total Mean Reward | 173 | 4 |
| Cooperative Multi-Agent Coordination | Overcooked-AI Asymmetric Advantages | Mean Reward | 140.6 | 4 |
| Cooperative Multi-Agent Coordination | Overcooked-AI Coordination Ring | Total Mean Reward | 115.4 | 4 |
| Cooperative Multi-Agent Coordination | Overcooked-AI Counter Circuit | Total Mean Reward | 60.42 | 4 |
| Cooperative Multi-Agent Coordination | Overcooked-AI Forced Coordination | Total Mean Reward | 44.25 | 4 |
| Cooperative Multi-Agent Reinforcement Learning | Overcooked-AI Cramped Room (test) | Mean Reward | 148.8 | 4 |
| Cooperative Multi-Agent Reinforcement Learning | Overcooked-AI Asymmetric Advantages (test) | Mean Reward | 124.4 | 4 |
| Cooperative Multi-Agent Reinforcement Learning | Overcooked-AI Coordination Ring (test) | Mean Reward | 97.35 | 4 |
| Cooperative Multi-Agent Reinforcement Learning | Overcooked-AI Counter Circuit (test) | Mean Reward | 43.65 | 4 |
| Cooperative Multi-Agent Reinforcement Learning | Overcooked-AI Forced Coordination (test) | Mean Reward | 35.43 | 4 |
Showing 10 of 15 rows
