Ant

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Reinforcement Learning | Ant v5 | Average Return: 6,633.8 | 57 |
| Continuous Robot Control | Ant v3 (test) | Reward: 5,648 | 48 |
| Reinforcement Learning | Ant v3 | Average Final Return: 9,108 | 26 |
| Locomotion | Ant IID (test) | Mean Episode Reward: 2,240 | 24 |
| Locomotion Control | Ant sigma 0.5 (test) | Episode Reward: 974 | 24 |
| Locomotion Control | Ant sigma 0.3 (test) | Episode Reward: 1,723 | 24 |
| Locomotion Control | Ant sigma 0.1 (test) | Episode Reward: 2,240 | 24 |
| Locomotion Control | Ant sigma 0.7 (test) | Episode Reward: 306 | 18 |
| Offline Reinforcement Learning | Ant kinematic shifts | Score: 120 | 16 |
| Offline Reinforcement Learning | Ant Medium D4RL | Normalized Score: 96.4 | 14 |
| Offline Policy Adaptation | ant medium-expert | Normalized Score: 79.3 | 14 |
| Offline Policy Adaptation | ant medium-replay | Normalized Score: 76.2 | 14 |
| Offline Policy Adaptation | ant medium | Normalized Score: 77.2 | 14 |
| Continuous Control | Ant v5 | Normalized Mean Return: 1.14 | 12 |
| Reinforcement Learning | Ant fixed linear adversary | Average Performance: 8,069 | 12 |
| Worst-case time-constrained reinforcement learning | Ant MuJoCo (test) | Normalized Worst-Case Reward: 1.66 | 12 |
| Robust Reinforcement Learning | Ant MuJoCo (fixed exponential adversary) | Average Performance: 7,724 | 12 |
| Continuous Control | Ant MuJoCo (test) | Worst-case Performance: 7,534 | 12 |
| Robot Locomotion | Ant v1 (test) | Performance Score: 2,370.93 | 12 |
| Locomotion control | Ant 1% offline data | Optimization Performance Score: 96 | 11 |
| Imitation Learning | Ant one-shot v2 | Normalized Score: 29.7 | 11 |
| Reinforcement Learning | Ant v4 | Average Return: 5,527 | 9 |
| Continuous Control | Ant v5 | Average Return: 6,501.4 | 9 |
| Offline Optimization | Ant 1% offline data | Score: 96 | 8 |
| Reinforcement Learning | Ant v5 | Coefficient of Variation: 4.2 | 8 |
Showing 25 of 84 rows