
AgentComm-Bench: Stress-Testing Cooperative Embodied AI Under Latency, Packet Loss, and Bandwidth Collapse

About

Cooperative multi-agent methods for embodied AI are almost universally evaluated under idealized communication: zero latency, no packet loss, and unlimited bandwidth. Real-world deployment on robots with wireless links, autonomous vehicles on congested networks, or drone swarms in contested spectrum offers no such guarantees. We introduce AgentComm-Bench, a benchmark suite and evaluation protocol that systematically stress-tests cooperative embodied AI under six communication impairment dimensions: latency, packet loss, bandwidth collapse, asynchronous updates, stale memory, and conflicting sensor evidence. AgentComm-Bench spans three task families: cooperative perception, multi-agent waypoint navigation, and cooperative zone search, and evaluates five communication strategies, including a lightweight method we propose based on redundant message coding with staleness-aware fusion. Our experiments reveal that communication-dependent tasks degrade catastrophically: stale memory and bandwidth collapse cause over 96% performance drops in navigation, while content corruption (stale or conflicting data) reduces perception F1 by over 85%. Vulnerability depends on the interaction between impairment type and task design; perception fusion is robust to packet loss but amplifies corrupted data. Redundant message coding more than doubles navigation performance under 80% packet loss. We release AgentComm-Bench as a practical evaluation protocol and recommend that cooperative embodied AI work report performance under multiple impairment conditions.
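To make the idea of redundant message coding with staleness-aware fusion concrete, here is a minimal sketch. It is not the authors' implementation: it assumes the simplest form of redundancy (sending each message several times over a channel that drops each copy independently) and an exponential down-weighting of older estimates. The names `expected_delivery`, `simulated_delivery`, `staleness_weight`, `fuse`, and the `half_life` parameter are hypothetical and chosen only for illustration.

```python
import random


def expected_delivery(copies, loss_prob):
    """Probability that at least one of `copies` independent sends survives."""
    return 1.0 - loss_prob ** copies


def simulated_delivery(copies, loss_prob, trials=10_000, seed=0):
    """Monte Carlo estimate of the same delivery probability."""
    rng = random.Random(seed)
    delivered = 0
    for _ in range(trials):
        # A message counts as delivered if any single copy is not dropped.
        if any(rng.random() >= loss_prob for _ in range(copies)):
            delivered += 1
    return delivered / trials


def staleness_weight(age_steps, half_life=5.0):
    """Assumed weighting: halve the influence of an estimate every `half_life` steps."""
    return 0.5 ** (age_steps / half_life)


def fuse(estimates):
    """Weighted average of (value, age_steps) pairs, favoring fresher values."""
    weights = [staleness_weight(age) for _, age in estimates]
    total = sum(weights)
    return sum(w * value for w, (value, _) in zip(weights, estimates)) / total
```

Under these assumptions, a single send over a link with 80% packet loss gets through with probability 0.2, while three redundant copies raise that to 1 - 0.8^3 = 0.488. The fusion step only illustrates one way to discount stale data; the weighting scheme in the paper may differ.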

Aayam Bansal, Ishaan Gangwani • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Collaborative Perception (CP) | Multi-agent Communication Environment (test) | Mean Normalized Performance Drop | 28.5 | 5 |
| Collaborative Perception (CP) | Multi-agent Simulation averaged across 6 impairment dimensions | AURC (% of max) | 73 | 5 |
| Navigation (NAV) | Multi-agent Communication Environment (test) | Mean Normalized Performance Drop | 63.6 | 5 |
| Navigation (NAV) | Multi-agent Simulation averaged across 6 impairment dimensions | AURC (% of max) | 69.8 | 5 |
| Search | Multi-agent Simulation averaged across 6 impairment dimensions | AURC (% of max) | 84.9 | 5 |
| Search | Multi-agent Communication Environment (test) | Mean Normalized Performance Drop | 31.4 | 5 |
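The listing does not define its metrics. As a rough guide to reading the "Mean Normalized Performance Drop" figures, the sketch below assumes the common definition: the percentage drop relative to the unimpaired score, averaged over impairment conditions. This definition, the function names, and the example scores are all assumptions for illustration, not values or formulas taken from the paper.

```python
def normalized_drop(clean_score, impaired_score):
    """Percentage drop of an impaired score relative to the clean baseline."""
    if clean_score <= 0:
        raise ValueError("clean_score must be positive")
    return 100.0 * (clean_score - impaired_score) / clean_score


def mean_normalized_drop(clean_score, impaired_scores):
    """Average the per-condition drops; `impaired_scores` maps condition -> score."""
    if not impaired_scores:
        raise ValueError("need at least one impaired condition")
    drops = [normalized_drop(clean_score, s) for s in impaired_scores.values()]
    return sum(drops) / len(drops)


# Made-up scores for illustration only (not results from the benchmark).
example_scores = {"latency": 0.60, "packet_loss": 0.72, "stale_memory": 0.08}
example_mean_drop = mean_normalized_drop(0.80, example_scores)
```

With these made-up numbers the per-condition drops are roughly 25%, 10%, and 90%, so the mean is about 41.7%. Under this reading, a lower value indicates a more robust method.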