Multi-Agent Strategic Reasoning on LeducHoldem OOD

First-mover Normalized Score: 83.51

Gemini-2.5-flash

May 6, 2026
Updated 27d ago

Evaluation Results

Method    Links
2026.05   83.51   96.96
2026.05   80.6    95.08
2026.05   70.12   66.64
2026.05   67.6    67.51
2026.05   66.42   69.22
2026.05   61.35   56.18
2026.05   60.93   64.41
2026.05   60.55   60.13
2026.05   56.21   55.85