Reinforcement Learning on MuJoCo Humanoid
[Chart: Average Return over time, May 24, 2023 to Mar 10, 2026. Highlighted entry: SPMD, 10,249.]
Updated 1mo ago
Evaluation Results
| Method          | Training Steps | Date    | Average Return |
|-----------------|----------------|---------|----------------|
| SPMD            |                | 2023.05 | 10,249         |
| SAC             |                | 2023.05 | 6,923          |
| SiMPO-Lin. Neg. | 1M             | 2026.03 | 5,466          |
| SiMPO-Linear    | 1M             | 2026.03 | 5,376          |
| SAC             | 1M             | 2026.03 | 5,298          |
| TD3             | 1M             | 2026.03 | 5,263          |
| DIPO            | 1M             | 2026.03 | 5,184          |
| SiMPO-Exp       | 1M             | 2026.03 | 5,100          |
| SiMPO-Square    | 1M             | 2026.03 | 5,068          |
| DACER           | 1M             | 2026.03 | 3,142          |
| QSM             | 1M             | 2026.03 | 2,308          |
| QVPO            | 1M             | 2026.03 | 1,375          |