| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| D4RL halfcheetah-medium-expert | VIPO | Normalized Score | 110 | 169 | 6d ago |
| D4RL hopper-medium-expert | ATAC | Normalized Score | 119.2 | 161 | 23d ago |
| D4RL walker2d-medium-expert | PSPO | Normalized Score | 116.1 | 132 | 23d ago |
| D4RL Medium-Replay Hopper | NEUBAY | Normalized Score | 110.6 | 109 | 6d ago |
| D4RL Medium HalfCheetah | SUMO | Normalized Score | 84.3 | 105 | 23d ago |
| D4RL Medium Walker2d | NEUBAY | Normalized Score | 106.4 | 104 | 23d ago |
| D4RL walker2d-random | AWAC | Normalized Score | 510 | 101 | 23d ago |
| MuJoCo halfcheetah | CEDGE-Policy | Normalized Return | 62.86 | 97 | 8d ago |
| D4RL Medium-Replay HalfCheetah | Adaptive Policy Selection and Fine-Tuning | Normalized Score | 95.8 | 97 | 23d ago |
| D4RL halfcheetah-random | ADMPO | Normalized Score | 45.4 | 94 | 23d ago |
| D4RL hopper-random | MOREL | Normalized Score | 53.6 | 86 | 23d ago |
| D4RL Walker2d Medium v2 | PhyB | Normalized Return | 95.5 | 85 | 1d ago |
| D4RL antmaze-umaze (diverse) | FAC | Normalized Score | 93.5 | 74 | 5d ago |
| D4RL Gym walker2d (medium-replay) | ROMI-CQL | Normalized Return | 109.7 | 73 | 5d ago |
| D4RL Medium Hopper | RRPI | Normalized Score | 109.4 | 72 | 23d ago |
| Kitchen Partial | GCPC | Normalized Score | 90.2 | 69 | 26d ago |
| D4RL halfcheetah v2 (medium-replay) | CQL | Normalized Score | 76.9 | 68 | 1d ago |
| hopper medium | QDFM | Normalized Score | 3,729 | 68 | 5d ago |
| D4RL halfcheetah-expert v2 | PhyB | Normalized Score | 113.7 | 66 | 1d ago |
| D4RL walker2d-expert v2 | PhyB | Normalized Score | 116.3 | 66 | 1d ago |
| D4RL hopper-expert v2 | PhyB | Normalized Score | 118.9 | 66 | 1d ago |
| D4RL AntMaze | KFC++ | AntMaze Umaze Return | 99.8 | 65 | 2mo ago |
| D4RL Gym halfcheetah-medium | SPQR | Normalized Return | 74.8 | 65 | 5d ago |
| MuJoCo Hopper | CEDGE-Policy | Normalized Score | 66.82 | 64 | 8d ago |
| MuJoCo Walker2d | MOBODY | Normalized Score | 78.14 | 64 | 8d ago |