High-Dimensional Control on SafetyPointGoal1 v0 (test)
[Chart: Reward over time; best result 28.25 (SAC), May 7, 2026]
Updated 26d ago
Evaluation Results
Method  Configuration              Date     Reward  Cost
SAC     Discounting Variant=Ad...  2026.05  28.25   29.37
SAC     Discounting Variant=Fi...  2026.05  27.82   52.53
SAC     Discounting Variant=Un...  2026.05  27.47   37.46
SAC     Discounting Variant=Cr...  2026.05  27.34   45.84
PPO     Discounting Variant=Ad...  2026.05  26.31   41.82
PPO     Discounting Variant=Un...  2026.05  22.59   51.25
PPO     Discounting Variant=Fi...  2026.05  22.58   51.26
PPO     Discounting Variant=Cr...  2026.05  20.62   54.13