| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multi-agent Offline Reinforcement Learning | MPE CN (Medium-replay) | Score 95.4 | 16 |
| Multi-agent Offline Reinforcement Learning | MPE CN (Random) | Score 88.3 | 16 |
| Multi-Agent Reinforcement Learning | MPE Adversary | Return 19.1 | 11 |
| Multi-Agent Reinforcement Learning | MPE Simple Spread | Return -46 | 11 |
| Multi-Agent Reinforcement Learning | MPE Speaker-Listener | Return -46 | 11 |
| Offline Multi-Agent Reinforcement Learning | MPE World (Random) | Average Normalized Score 94.3 | 8 |
| Multi-Agent Reinforcement Learning | MPE pz-mpe-simple-adversary | Params (K) 124.36 | 5 |
| Multi-Agent Reinforcement Learning | MPE pz-mpe-simple-tag | Params (K) 102.28 | 5 |
| Multi-Agent Reinforcement Learning | MPE pz-mpe-simple-spread | Number of learnable parameters (K) 191.18 | 5 |
| Simple Spread | MPE | Mean Episodic Reward -390.18 | 4 |
| Multi-agent Reinforcement Learning | MPE Predator-prey (PP) v1 (Expert) | Normalized Score 118.2 | 4 |
| Multi-agent Reinforcement Learning | MPE Predator-prey (PP) v1 (Med-Rep) | Normalized Score 71.1 | 4 |
| Multi-agent Reinforcement Learning | MPE Predator-prey (PP) v1 (Random) | Normalized Score 78.5 | 4 |
| Multi-agent Reinforcement Learning | MPE Cooperative Navigation (CN) v1 (Expert) | Normalized Score 114.9 | 4 |
| Multi-agent Multi-objective Reinforcement Learning | MPE | Hypervolume 11,080.9224 | 3 |