| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| LunarLanderContinuous offline trajectories v2 | MFRL | Episodic Cumulative Reward | 254.55 | 35 | 4d ago |
| DMControl 500k | MLR | Spin Score | 973 | 33 | 4d ago |
| DMControl 100k | Sampled MuZero | DMControl: Finger Spin Score | 986.38 | 29 | 4d ago |
| MountainCar Source | | Success Rate | 100 | 27 | 4d ago |
| MuJoCo Walker2d v4 | Opti-DICE | Normalized Performance | 13,060 | 24 | 4d ago |
| MuJoCo Ant v4 | Flex-f-DICE | Normalized Return | 136 | 24 | 4d ago |
| Humanoid 17-Dof | SATR | Final Return | 13,860 | 21 | 4d ago |
| D4RL Hopper medium | OFQL | Normalized Return | 103.6 | 19 | 4d ago |
| Hopper 3-Dof | SATR | Final Return | 2,735 | 18 | 4d ago |
| MountainCar Drift II - Dynamics Shift | | Success Rate | 100 | 18 | 4d ago |
| MountainCar Drift I - Dynamics Shift | | Success Rate | 100 | 18 | 4d ago |
| MuJoCo Reacher v4 | DIDA | Normalized Performance | 103 | 18 | 4d ago |
| MuJoCo Pusher v4 | AD-SAC | Normalized Performance | 1.36 | 18 | 4d ago |
| MuJoCo HumanoidStandup v4 | VDPO | Normalized Performance | 1.29 | 18 | 4d ago |
| MuJoCo Humanoid v4 | VDPO | Normalized Performance (Ret_nor) | 115 | 18 | 4d ago |
| MuJoCo Hopper v4 | | Normalized Performance | 1.25 | 18 | 4d ago |
| MuJoCo HalfCheetah v4 | AD-SAC | Normalized Performance | 107 | 18 | 4d ago |
| DMC-GB video hard | SGQN | Cartpole Swingup Score | 54,443 | 18 | 4d ago |
| DeepMind Control Suite visual observations | DC | Acrobot Swingup Score | 24,829 | 16 | 4d ago |
| hopper | RMF | Average Reward | 2,133,326 | 15 | 4d ago |
| MountainCar Explicit Structural Drift II | | Success Rate (Source) | 100 | 14 | 4d ago |
| D4RL Walker2d medium | COMBO | Normalized Return | 81.9 | 14 | 4d ago |
| HalfCheetah v5 | DS-TD3 | Normalized Mean Return | 1.05 | 12 | 4d ago |
| Ant v5 | RNM-TD3 | Normalized Mean Return | 1.14 | 12 | 4d ago |
| Walker2d 6-Dof | SATR | Final Return | 5,143 | 12 | 4d ago |