InvertedDoublePendulum

Benchmarks

Task Name	Dataset Name	SOTA Result
Reinforcement Learning	InvertedDoublePendulum v5	Avg AUC (z-scored): 1.29	13
Reinforcement Learning	InvertedDoublePendulum v3	Average Final Return: 9,360	7
Continuous Control	InvertedDoublePendulum v1 (train)	Max Average Return: 9,355.52	7
Reinforcement Learning	InvertedDoublePendulum Gymnasium	Mean Best Reward: 3,609.37	5
Reinforcement Learning Surrogate Modeling	InvertedDoublePendulum (IDP) (test)	Reward Ratio (%): 103	4
Policy Improvement	InvertedDoublePendulum (IDP)	Success Rate: 38	4
Reinforcement Learning	InvertedDoublePendulum v4	Average Episodic Reward: 9,167.5	4
Continuous Control	InvertedDoublePendulum v5	Average Episodic Reward: 9,349.2	2
