Cooperative Multi-Agent Reinforcement Learning on Speaker-Listener (last 2% of train)
[Chart: Mean Episodic Reward over time. Top entry: SACHI, -17.22, May 8, 2026]
Updated 22d ago
Evaluation Results
Method   Configuration     Date      Mean Episodic Reward
SACHI    L=2, d=64, K=1    2026.05   -17.22
DGN                        2026.05   -17.82
QMIX                       2026.05   -18.45
MAPPO                      2026.05   -19.44
IPPO                       2026.05   -20.38
DCG                        2026.05   -21.19
DICG                       2026.05   -22.45
VDN                        2026.05   -25.68
IQL                        2026.05   -27.7
QPLEX                      2026.05   -29.49
QTRAN                      2026.05   -34.29
CASEC                      2026.05   -43.79
FOP                        2026.05   -48.15