Adversarial Reinforcement Learning on Connect Four (100% optimal adversary, test-time)

-0.98 Avg Return

ESPER

Chart: Avg Return over time (y-axis ticks: -1.0008, -0.9954, -0.99, -0.9846; x-axis: Jul 25, 2024)
Updated 1mo ago

Evaluation Results

Method    Links
2024.07   -0.98
2024.07   -1
2024.07   -1