T2MAC: Targeted and Trusted Multi-Agent Communication through Selective Engagement and Evidence-Driven Integration
About
Communication is a potent mechanism for harmonizing the behaviors of multiple agents. However, existing works primarily concentrate on broadcast communication, which not only lacks practicality but also leads to information redundancy. This surplus, one-size-fits-all information can adversely impact communication efficiency. Furthermore, existing works often resort to basic mechanisms for integrating observed and received information, impairing the learning process. To tackle these difficulties, we propose Targeted and Trusted Multi-Agent Communication (T2MAC), a straightforward yet effective method that enables agents to learn selective engagement and evidence-driven integration. With T2MAC, agents can craft individualized messages, pinpoint ideal communication windows, and engage with reliable partners, thereby refining communication efficiency. Upon receiving messages, agents integrate information observed and received from different sources at the evidence level. This process enables agents to collectively use evidence gathered from multiple perspectives, fostering trusted and cooperative behaviors. We evaluate our method on a diverse set of cooperative multi-agent tasks of varying difficulty and scale, ranging from Hallway and MPE to SMAC. The experiments indicate that the proposed model not only surpasses state-of-the-art methods in cooperative performance and communication efficiency, but also exhibits impressive generalization.
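The two ingredients above, selective engagement and evidence-level integration, can be illustrated with a small sketch. The function names, the evidence-summation fusion rule, and the uncertainty-gated communication trigger below are illustrative assumptions in the spirit of evidential (Dirichlet-based) fusion, not the paper's exact formulation.

```python
# Hypothetical sketch: each agent outputs non-negative evidence over K
# actions; received evidence from trusted partners is fused with local
# evidence, and a Dirichlet over the actions yields per-action beliefs
# plus an explicit uncertainty mass. All names here are illustrative.

def fuse_evidence(evidence_sources):
    """Element-wise sum of per-action evidence from several sources
    (an assumed fusion rule, one common choice in evidential learning)."""
    k = len(evidence_sources[0])
    return [sum(src[i] for src in evidence_sources) for i in range(k)]

def dirichlet_opinion(evidence):
    """Map evidence to (beliefs, uncertainty) via alpha = evidence + 1."""
    k = len(evidence)
    alpha = [e + 1.0 for e in evidence]
    strength = sum(alpha)                  # Dirichlet strength S
    beliefs = [e / strength for e in evidence]
    uncertainty = k / strength             # mass left unassigned to any action
    return beliefs, uncertainty

def should_communicate(local_evidence, threshold=0.3):
    """Selective engagement: request messages only when the agent's
    own uncertainty is high (threshold is an illustrative parameter)."""
    _, u = dirichlet_opinion(local_evidence)
    return u > threshold

local = [4.0, 1.0, 0.0]       # this agent's own evidence over 3 actions
received = [6.0, 0.0, 0.0]    # a trusted partner's message, same form
beliefs, u = dirichlet_opinion(fuse_evidence([local, received]))
```

In this sketch, beliefs and uncertainty always sum to one, so fusing corroborating evidence from a partner shrinks the uncertainty mass while sharpening the belief over the agreed-upon action.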
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Cooperative Navigation | Cooperative Navigation easy | Mean Episode Reward | 2.23 | 14 |
| Cooperative Navigation | Cooperative Navigation super_hard | Mean Episode Reward | -2.53 | 7 |
| Predator-Prey | Predator Prey easy | Mean Episode Reward | -1.1 | 7 |
| Predator-Prey | Predator Prey medium | Mean Episode Reward | -1.27 | 7 |
| Predator-Prey | Predator Prey hard | Mean Episode Reward | -1.47 | 7 |
| Predator-Prey | Predator Prey super_hard | Mean Episode Reward | -1.66 | 7 |
| Cooperative Navigation | multi-agent particle environment medium | Average Return | -2.27 | 7 |
| Cooperative Navigation | Cooperative Navigation hard | Mean Episode Reward | -2.51 | 7 |
| Multi-agent cooperation | SMAC 1o_10b_vs_1r medium | Win Rate | 28.91 | 7 |
| Multi-agent cooperation | SMAC 1o_10b_vs_1r hard | Win Rate | 29.06 | 7 |