Humanoid

Benchmarks

Task Name	Dataset Name	Metric	SOTA Result
Reinforcement Learning	Humanoid	Zero-Shot Reward	90,921,063	32
Reinforcement Learning	Humanoid v3	Avg Final Return	11,888	26
Humanoid Locomotion	Humanoid Randomized Task (OOD Sweep)	Reward	-3.58	24
High-Dimensional Bayesian Optimization	Humanoid d = 6392	Rank	1	21
Continuous Control	Humanoid 17-Dof	Final Return	13,860	21
Robot Locomotion	Humanoid	Cumulative Reward	5,299	16
Continuous Control	Humanoid MuJoCo v2 (evaluation)	Action Performance (p_act=0.1)	5,078.3	14
Reinforcement Learning	Humanoid v4	Reward	5,715	13
Continuous Control	Humanoid v5	Average Return	5,906.7	13
Reinforcement Learning	Humanoid (delta=[0.8^6, 0.5^6, 0.2^5], kappa=4.0) v5 (test)	Return	5,620	12
Worst-case time-constrained reinforcement learning	Humanoid MuJoCo (test)	Normalized Worst-Case Reward	4.02	12
Robot Locomotion	Humanoid v1 (test)	Total Score	93,123.84	12
Reinforcement Learning	Humanoid v5	Performance Score	5,906.7	11
Locomotion	Humanoid v4	Mean Episode Return	7,365.7	10
Locomotion	Humanoid	Relative Return Improvement	18.52	10
Black-box Optimization	Humanoid	Objective Value	669.52	8
High-Dimensional Locomotion	Humanoid v4 (test)	Reward	6,907.99	8
Reinforcement Learning	Humanoid v5	Coefficient of Variation (%)	6.3	8
Reinforcement Learning	Humanoid v5	Average Returns	5,228	8
Constrained Reinforcement Learning	Humanoid	Episodic Reward	1,734.1	8
Reinforcement Learning	Humanoid gravity v2	Average Return	6,360	8
Trajectory Optimization	Humanoid Standup	Computational Time (s)	17.6	8
Continuous Control	Humanoid v4	Average Cumulative Reward	4,978.5	7
Robotic Control	Humanoid v4	Local Optima Escape Rate	72.3	7
Continuous Control	Humanoid	Humanoid Return (p_act=0.1)	680.1	7

Showing 25 of 63 rows