Reinforcement Learning on HalfCheetah-v5

Average Return: 13,996.2

SAC+TQC

[Chart: Average Return over time, Feb 5, 2026 to Apr 30, 2026]
Updated 1mo ago

Evaluation Results

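| Method | Date | Average Return | (unlabeled) | (unlabeled) | Links |
|---|---|---|---|---|---|
| | 2026.02 | 13,996.2 | - | - | |
| | 2026.02 | 13,787.8 | - | - | |
| | 2026.02 | 13,182.1 | - | - | |
| | 2026.02 | 13,120.2 | - | - | |
| | 2026.02 | 12,974 | - | - | |
| | 2026.02 | 12,776.4 | - | - | |
| | 2026.02 | 12,400 | - | - | |
| | 2026.02 | 12,400 | - | - | |
| | 2026.02 | 12,103.9 | - | - | |
| | 2026.02 | 11,679 | - | - | |
| | 2026.02 | 10,930.2 | - | - | |
| | 2026.02 | 10,057 | - | - | |
| | 2026.02 | 9,875 | - | - | |
| | 2026.04 | 9,863.4 | 40.5 | 62.7 | |
| | 2026.02 | 9,748 | - | - | |
| | 2026.02 | 9,748 | - | - | |
| | 2026.04 | 9,686.3 | 38.4 | 66 | |
| | 2026.04 | 9,624.1 | 39.6 | 71.3 | |
| | 2026.04 | 9,576.9 | 37.7 | 68 | |
| | 2026.02 | 9,120 | - | - | |
| | 2026.02 | 9,106 | - | - | |
| | 2026.02 | 8,954 | - | - | |
| | 2026.02 | 8,949 | - | - | |
| | 2026.02 | 8,931 | - | - | |
| | 2026.02 | 8,931 | - | - | |
| | 2026.02 | 8,199.5 | - | - | |
| | 2026.02 | 7,957 | - | - | |
| | 2026.02 | 7,939 | - | - | |
| | 2026.02 | 7,939 | - | - | |
| | 2026.02 | 7,378 | - | - | |
| | 2026.02 | 4,284.8 | - | - | |
| | 2026.02 | 2,059 | - | - | |
| | 2026.02 | 2,057 | - | - | |
| | 2026.02 | 2,055 | - | - | |
| | 2026.02 | 2,051 | - | - | |
| | 2026.02 | 2,039 | - | - | |
| | 2026.02 | 2,039 | - | - | |
| | 2026.02 | 2,013 | - | - | |
| | 2026.02 | 2,013 | - | - | |
| | 2026.02 | 1,103 | - | - | |
| | 2026.02 | 1,103 | - | - | |
| | 2026.02 | 896 | - | - | |
| | 2026.02 | 895 | - | - | |
| | 2026.02 | 615 | - | - | |
| | 2026.02 | 612 | - | - | |
| | 2026.02 | 373 | - | - | |
| | 2026.02 | 373 | - | - | |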