Multi-Agent Strategic Reasoning on Kuhn Poker In-domain
Top result: GiGPO, Success Rate 65.6 (Mar 4, 2026)
Updated 1mo ago
Evaluation Results
| Method    | Category         | Date    | Success Rate |
|-----------|------------------|---------|--------------|
| GiGPO     | RL               | 2026.03 | 65.6         |
| MAGE      | Meta-RL          | 2026.03 | 65.6         |
| ReAct     | In-Context Le... | 2026.03 | 64.8         |
| Reflexion | In-Context Le... | 2026.03 | 64.8         |
| GRPO      | RL               | 2026.03 | 64.8         |
| A-MEM     | Memory-based     | 2026.03 | 64.1         |
| Memento   | Memory-based     | 2026.03 | 64.1         |
| LAMER     | Meta-RL          | 2026.03 | 59.4         |