Reinforcement Learning on FailureBench Bounded Soccer
[Chart: Avg Return by date. Top entry: FARL, 2,276.53 (Jan 12, 2026). Updated 3mo ago.]
Evaluation Results
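| Method  | Settings                 | Date    | Avg Return |
|---------|--------------------------|---------|------------|
| FARL    | mode=fine-tuning, init... | 2026.01 | 2,276.53   |
| CPO     | mode=fine-tuning, init... | 2026.01 | 692.9      |
| P3O     | mode=fine-tuning, init... | 2026.01 | 598.55     |
| PPO-Lag | mode=fine-tuning, init... | 2026.01 | 404.8      |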