Continuous Control on Ant v4
[Figure: Average Return over time, May 6, 2025 to May 20, 2026. Best result: SMFP, 7,518.3.]
Updated 13d ago
Evaluation Results
Method      Details                      Date      Average Return
SMFP        Training Time (h)=2.7,...    2026.05   7,518.3
DIME        Training Time (h)=3.5,...    2026.05   7,103.6
MaxEntDP    Training Time (h)=2.3,...    2026.05   5,717.9
DIPO        Training Time (h)=9.6,...    2026.05   5,665.9
CSAC        -                            2025.05   5,538.2
SAC         Training Time (h)=2.2,...    2026.05   5,530.6
SAC         -                            2025.05   5,229.39
TD3         Training Time (h)=0.4,...    2026.05   4,583.8
QSM         Training Time (h)=0.8,...    2026.05   4,206.4
QVPO        Training Time (h)=5.2,...    2026.05   4,040.1
SD3         -                            2025.05   2,960.61
PPO         Training Time (h)=0.3,...    2026.05   2,781.9
TD3         -                            2025.05   2,368.35
SPO         Training Time (h)=0.3,...    2026.05   2,100.2
PPO         -                            2025.05   1,836.47