Cooperative Multi-Agent Reinforcement Learning on Reference (last 2% of train)
[Chart: Mean Episodic Reward over time. Top entry: SACHI, -25.39 (May 8, 2026). Updated 22d ago.]
Evaluation Results
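| Method | Details | Date | Mean Episodic Reward |
|--------|---------|------|----------------------|
| SACHI | L=2, d=64, K=1, attent... | 2026.05 | -25.39 |
| DCG | | 2026.05 | -27.34 |
| QMIX | | 2026.05 | -28.81 |
| MAPPO | | 2026.05 | -34.57 |
| IPPO | | 2026.05 | -34.69 |
| DICG | | 2026.05 | -34.97 |
| FOP | | 2026.05 | -35.38 |
| IQL | | 2026.05 | -36.12 |
| VDN | | 2026.05 | -38.44 |
| QTRAN | | 2026.05 | -39.33 |
| DGN | | 2026.05 | -41.95 |
| CASEC | | 2026.05 | -50.71 |
| QPLEX | | 2026.05 | -65.72 |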