MEMOA: Massive Mixtures of Online Agents via Mean-Field Decentralized Nash Equilibria
About
In the modern age of large-scale AI, federated learning has become an increasingly important tool for training large populations of AI agents; however, its computational and communication costs can quickly become prohibitive as the number of agents grows. This is precisely where decentralized agentic strategies shine: each agent acts autonomously, using only its own state together with a minimal summary of the ensemble, namely the mean-field. We derive the unique optimal decentralized policy in closed form. Optimality is characterized through a worst-client (minimax) criterion: minimizing the under-performer regret, namely the maximal online cost incurred by the weakest agent in the ensemble. We further prove that, in the large-population limit, the resulting decentralized policy converges to the Nash-optimal centralized policy, whose direct computation is not scalable. We use an online weighting mechanism to optimize the server-computed mixture of client predictions, thereby improving the mean prediction in addition to the previously optimized weakest-client prediction. Numerical experiments verify our theoretical guarantees and demonstrate that our decentralized policy typically outperforms natural greedy decentralized baselines.
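The abstract names three ingredients: a mean-field summary of the ensemble, a worst-client cost, and an online weighting of client predictions on the server. It does not state the actual update rules, so the following is only a minimal sketch. It assumes scalar states, squared-error loss, and a generic exponentially weighted (Hedge-style) update as a stand-in for the paper's weighting mechanism. All function names, the learning rate `eta`, and the toy biased clients are hypothetical.

```python
import math
import random


def mean_field(states):
    """Ensemble summary shared with every agent: the average state (assumed scalar)."""
    return sum(states) / len(states)


def worst_client_cost(costs):
    """Minimax criterion: the cost incurred by the weakest agent."""
    return max(costs)


def update_mixture_weights(weights, client_preds, target, eta=2.0):
    """One exponentially weighted update of the server's mixture weights.

    Assumption: squared-error loss and a Hedge-style rule, not the paper's own mechanism.
    """
    scaled = [
        w * math.exp(-eta * (pred - target) ** 2)
        for w, pred in zip(weights, client_preds)
    ]
    total = sum(scaled)
    return [s / total for s in scaled]


def mixture_prediction(weights, client_preds):
    """Server-side prediction: the weighted mixture of client predictions."""
    return sum(w * pred for w, pred in zip(weights, client_preds))


# Toy run: three hypothetical clients track a sine wave with fixed biases.
random.seed(0)
biases = [0.5, 0.05, -0.3]
weights = [1.0 / len(biases)] * len(biases)

for t in range(200):
    target = math.sin(0.1 * t)
    client_preds = [target + b + random.gauss(0.0, 0.01) for b in biases]
    weights = update_mixture_weights(weights, client_preds, target)

print("final mixture weights:", weights)
```

In this toy run the weight mass shifts toward the client with the smallest bias, which is the qualitative behavior an online mixture is meant to have; the decentralized closed-form policy itself is not reproduced here.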
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Time-series prediction | ETT | RMSE | 0.0263 | 29 |
| Time-series prediction | Logistic | Average RMSE | 0.1883 | 20 |
| Time-series prediction | BoC | Average RMSE | 0.0067 | 20 |
| Time-series prediction | BoC (val) | Average RMSE | 0.0049 | 20 |
| Time-series prediction | Periodic | Average RMSE | 0.9171 | 20 |
| Time-series prediction | ETT (val) | Average RMSE | 0.0514 | 20 |
| Time-series prediction | Concept | Average RMSE | 0.247 | 12 |
| Time-series prediction | Concept Drift | Average RMSE | 0.32 | 8 |