
InvertedPendulum

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Imitation Learning from Observation | InvertedPendulum v4 | AER 5.7 | 8 |
| Reinforcement Learning | InvertedPendulum v2 | Mean Reward 1,000 | 8 |
| Continuous Control | InvertedPendulum v2 | Average Return 1,000 | 7 |
| Continuous Control | InvertedPendulum v1 (train) | Max Average Return 1,000 | 7 |