Single-agent Reinforcement Learning (SARL) on Seaquest Clean
[Chart: Average Episode Return over time. Top entry: BIRD, 1,684.1. Date shown: May 7, 2026.]
Updated 26d ago
Evaluation Results

Method          Date      Average Episode Return
BIRD            2026.05   1,684.1
BehaviorGuard   2026.05   1,672.8
Original        2026.05   1,662.8
PD              2026.05     901.5