| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| DMControl 500k | | Spin Score | 979 | 42 | 1mo ago |
| DMControl 100k | Sampled MuZero | DMControl: Finger Spin Score | 986.38 | 38 | 1mo ago |
| LunarLanderContinuous offline trajectories v2 | MFRL | Episodic Cumulative Reward | 254.55 | 35 | 1mo ago |
| MuJoCo Walker2d v4 | Opti-DICE | Normalized Performance | 13,060 | 34 | 1mo ago |
| MuJoCo Hopper v4 | tdBN | Normalized Performance | 3,592 | 28 | 1mo ago |
| MountainCar Source | | Success Rate | 100 | 27 | 1mo ago |
| MuJoCo Ant | TOP-TD3 | Average Reward | 6,336 | 26 | 1mo ago |
| MuJoCo HalfCheetah | TOP-TD3 | Average Reward | 13,144 | 25 | 6d ago |
| MuJoCo Ant v4 | Flex-f-DICE | Normalized Return | 136 | 24 | 1mo ago |
| Humanoid 17-Dof | SATR | Final Return | 13,860 | 21 | 1mo ago |
| D4RL Hopper medium | OFQL | Normalized Return | 103.6 | 19 | 1mo ago |
| Hopper 3-Dof | SATR | Final Return | 2,735 | 18 | 1mo ago |
| MountainCar Drift II - Dynamics Shift | | Success Rate | 100 | 18 | 1mo ago |
| MountainCar Drift I - Dynamics Shift | | Success Rate | 100 | 18 | 1mo ago |
| MuJoCo Reacher v4 | DIDA | Normalized Performance | 103 | 18 | 1mo ago |
| MuJoCo Pusher v4 | AD-SAC | Normalized Performance | 1.36 | 18 | 1mo ago |
| MuJoCo HumanoidStandup v4 | VDPO | Normalized Performance | 1.29 | 18 | 1mo ago |
| MuJoCo Humanoid v4 | VDPO | Normalized Performance (Ret_nor) | 115 | 18 | 1mo ago |
| MuJoCo HalfCheetah v4 | AD-SAC | Normalized Performance | 107 | 18 | 1mo ago |
| DMC-GB video hard | SGQN | Cartpole Swingup Score | 54,443 | 18 | 1mo ago |
| MuJoCo Reacher | TRPO | Average Reward | 6.22 | 18 | 6d ago |
| Walker2d v5 | DBC | Avg Return | 6,138.2 | 17 | 5d ago |
| DeepMind Control Suite visual observations | DC | Acrobot Swingup Score | 24,829 | 16 | 1mo ago |
| Hopper v5 | DBC | Average Return | 3,732.5 | 15 | 5d ago |
| BipedalWalker v3 | MA-MPPI | Episodic Cumulative Reward | 298.4 | 15 | 23d ago |