InvertedPendulum

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Reinforcement Learning | InvertedPendulum v2 | Mean Reward: 1,000 | 27 |
| Reinforcement Learning | InvertedPendulum | Mean Reward: 1,000 | 8 |
| Continuous Control | InvertedPendulum v5 | Average Episodic Reward: 1,000 | 8 |
| Imitation Learning from Observation | InvertedPendulum v4 | AER: 5.7 | 8 |
| Continuous Control | InvertedPendulum v2 | Average Return: 1,000 | 7 |
| Continuous Control | InvertedPendulum v1 (train) | Max Average Return: 1,000 | 7 |
| Reinforcement Learning | InvertedPendulum Gymnasium | Mean Best Reward: 1,000 | 5 |
| Continuous Control | InvertedPendulum MuJoCo v5 | Max Evaluation Return: 1,000 | 5 |
| Reinforcement Learning | InvertedPendulum v4 | Average Episodic Reward: 1,000 | 4 |