POPGym

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
| --- | --- | --- | --- | --- |
| Reinforcement Learning | POPGym Noisy Stateless Pendulum Hard | MMER | 0.657 | 9 |
| Reinforcement Learning | POPGym Stateless Pendulum (Hard) | MMER | 82.8 | 9 |
| Reinforcement Learning | POPGym Noisy Stateless CartPole (Hard) | MMER | 20.7 | 9 |
| Reinforcement Learning | POPGym Stateless CartPole Hard | MMER | 0.127 | 9 |
| Reinforcement Learning | POPGym Aggregated (48 tasks) | Aggregated Return (All) | 10.4 | 6 |
| Offline Reinforcement Learning | POPGym | Average Normalized Score (All) | 9.5 | 5 |
| Memory | POPGym Copy k=10 | Temporal Range | 16.715 | 4 |
| Memory | POPGym Copy k=5 | Temporal Range | 17.255 | 4 |
| Memory | POPGym Copy k=3 | Temporal Range | 17.312 | 4 |
| Memory | POPGym Copy k=1 | Temporal Range | 12.294 | 4 |
| Memory | POPGym RepeatFirst | Temporal Range | 21.177 | 4 |
| Control | POPGym Noisy Stateless CartPole | Temporal Range | 15.274 | 4 |
| Control | POPGym Stateless CartPole | Temporal Range | 13.704 | 4 |
| Control | POPGym CartPole | Temporal Range | 12.362 | 4 |