Reinforcement Learning on Reacher Easy OOD
[Chart: Score (mass×0.7) by date. Plotted point: HaM-World, 158.5, May 7, 2026.]
Updated 26d ago
Evaluation Results
| Method | Protocol | Date | Score (mass×0.7) | Score (mass×1.3) | Score (damp×0.5) | Score (damp×2.0) | Score (act×0.7) | Score (act×1.3) | Average Score |
|---|---|---|---|---|---|---|---|---|---|
| HaM-World | zero-shot | 2026.05 | 158.5 | 158.8 | 141.7 | 139.7 | 148.8 | 151.8 | 149.9 |
| TD-MPC2 | zero-shot | 2026.05 | 135.3 | 130.1 | 131.8 | 125.1 | 131.5 | 138.1 | 132 |
| SAC | zero-shot | 2026.05 | 98.5 | 98 | 86 | 83 | 88.6 | 100.2 | 92.4 |
| PPO | zero-shot | 2026.05 | 11.7 | 11.8 | 14.6 | 13.9 | 12.7 | 12.9 | 12.9 |
| DreamerV3 | zero-shot | 2026.05 | 8.7 | 11.4 | 9.3 | 8.1 | 8.9 | 9.4 | 9.3 |
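The Average Score column appears to be the unweighted mean of the six perturbation scores (mass×0.7, mass×1.3, damp×0.5, damp×2.0, act×0.7, act×1.3). The page does not state this; it is an assumption that happens to reproduce every reported average to one decimal place. A minimal Python sketch of that check, with the numbers copied from the results above and the names `scores`, `reported_average`, and `mean_score` chosen here for illustration:

```python
# Per-method scores copied from the Evaluation Results, in column order:
# mass×0.7, mass×1.3, damp×0.5, damp×2.0, act×0.7, act×1.3
scores = {
    "HaM-World": [158.5, 158.8, 141.7, 139.7, 148.8, 151.8],
    "TD-MPC2": [135.3, 130.1, 131.8, 125.1, 131.5, 138.1],
    "SAC": [98.5, 98.0, 86.0, 83.0, 88.6, 100.2],
    "PPO": [11.7, 11.8, 14.6, 13.9, 12.7, 12.9],
    "DreamerV3": [8.7, 11.4, 9.3, 8.1, 8.9, 9.4],
}

# Average Score values as reported on the page.
reported_average = {
    "HaM-World": 149.9,
    "TD-MPC2": 132.0,
    "SAC": 92.4,
    "PPO": 12.9,
    "DreamerV3": 9.3,
}


def mean_score(values):
    """Unweighted mean of the perturbation scores (assumed aggregation)."""
    return sum(values) / len(values)


for method, values in scores.items():
    avg = mean_score(values)
    # The reported averages are rounded to one decimal, so allow a
    # tolerance of half a unit in that decimal place.
    assert abs(avg - reported_average[method]) <= 0.05 + 1e-9, method
    print(f"{method}: computed {avg:.1f}, reported {reported_average[method]}")
```

If the leaderboard used a different aggregation, such as a weighted mean, the assertion would fail for at least one method. With these numbers it passes for all five.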