
Reinforcement Learning on Hopper v4

Average Return: 27,721,263

pop-SAN

[Chart: Average Return over time, Jan 29, 2026 to Mar 10, 2026]
Updated 1mo ago

Evaluation Results

Method  Links
2026.02  27,721,263
2026.02  3,446,131
2026.02  3,410,164
2026.02  3,403,148
2026.02  3,385,157
2026.02  3,098,281
2026.02  356,568
352,094
2026.01  3,462
2026.01  3,414
2026.01  3,384
2026.01  3,380
2026.01  3,349
2026.03  2,944.3
2026.03  2,329.7
2026.03  2,017
2026.03  1,650.8