Offline Multi-Agent Reinforcement Learning on SMAC
[Chart: 3s5z Win Rate. Top result: DLM-GRPO, 97 (Apr 26, 2026). Available metrics: 3s5z, 10m_vs_11m, 5m_vs_6m, 3s_vs_5z, 6h_vs_8z, and CORRIDOR Win Rate.]
Evaluation Results
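| Method | Details | Date | 3s5z Win Rate | 10m_vs_11m Win Rate | 5m_vs_6m Win Rate | 3s_vs_5z Win Rate | 6h_vs_8z Win Rate | CORRIDOR Win Rate |
|----------|---------------------------|---------|----|-----|----|----|----|----|
| DLM-GRPO | Type=LLM-based, Traini... | 2026.04 | 97 | 100 | 80 | 94 | 75 | 92 |
| DLM-SFT  | Type=LLM-based, Traini... | 2026.04 | 94 | 98  | 72 | 84 | 67 | 81 |
| MADT     | Type=LLM-based            | 2026.04 | 81 | 88  | 67 | 72 | 53 | 64 |
| GATO     | Type=LLM-based            | 2026.04 | 72 | 92  | 63 | 65 | 37 | 56 |
| SAYCAN   | Type=LLM-based            | 2026.04 | 5  | 10  | 7  | 8  | 2  | 0  |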