Learning Large-Scale Competitive Team Behaviors with Mean-Field Interactions and Online Opponent Modeling

About

While multi-agent reinforcement learning (MARL) has proven effective across both collaborative and competitive tasks, existing algorithms often struggle to scale to large populations of agents. Recent advances in mean-field (MF) theory provide scalable solutions by approximating population interactions as a continuum, yet most existing frameworks focus exclusively on either fully cooperative or purely competitive settings. To bridge this gap, we introduce MF-MAPPO, a mean-field extension of PPO designed for zero-sum team games that integrate intra-team cooperation with inter-team competition. MF-MAPPO employs a shared actor and a minimally informed critic per team and is trained directly on finite-population simulators, thereby enabling deployment to realistic scenarios with thousands of agents. We further show that MF-MAPPO naturally extends to partially observable settings through a simple gradient-regularized training scheme. We evaluate on large-scale benchmark scenarios in MFEnv, our own simulation platform for MF team games, including offense-defense battlefield tasks and variants of population-based rock-paper-scissors games that admit analytical solutions. Across these benchmarks, MF-MAPPO outperforms existing methods and exhibits complex, heterogeneous behaviors, demonstrating the effectiveness of combining mean-field theory and MARL techniques at scale.
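To make the mean-field idea concrete, the following is a minimal sketch of how a team's empirical state distribution can be computed from a finite population, assuming discrete states. It is not taken from MF-MAPPO or MFEnv: the function name `empirical_mean_field`, the 16-cell state space, the team sizes, and the way the two distributions are concatenated into one input are all illustrative assumptions.

```python
from collections import Counter
import random


def empirical_mean_field(agent_states, num_states):
    """Return the fraction of agents occupying each discrete state.

    Hypothetical helper for illustration; not part of MF-MAPPO or MFEnv.
    """
    if not agent_states:
        raise ValueError("agent_states must contain at least one agent")
    counts = Counter(agent_states)
    n = len(agent_states)
    return [counts.get(s, 0) / n for s in range(num_states)]


# Assumed setup: a 4x4 grid flattened to 16 cells, two teams of different sizes.
rng = random.Random(0)
num_states = 16
blue_states = [rng.randrange(num_states) for _ in range(1000)]
red_states = [rng.randrange(num_states) for _ in range(800)]

mu_blue = empirical_mean_field(blue_states, num_states)
mu_red = empirical_mean_field(red_states, num_states)

# One possible population-level input: both team distributions side by side.
# Its length depends on the number of states, not on the number of agents.
observation = mu_blue + mu_red
```

The point of the sketch is that the size of `observation` stays fixed as the teams grow, which is what lets a single shared policy per team scale to large populations.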

Bhavini Jeloka, Yue Guan, Panagiotis Tsiotras • 2025

Related benchmarks

Task	Dataset	Result	Rank
Mean Field Team Games competition	Battlefield 4x4 grid	Avg Reward: 86.26	25
Population-based game playing	Rock-Paper-Scissors (RPS) standard population-based	Average Reward: 0.00e+0	2

