Safe Reinforcement Learning on 15-concept neural simulation environment (test)
[Chart: Return by date. Highlighted entry: Unconstrained, Return 34.64, Apr 5, 2026. Selectable metrics: Return, RHSI (raw), Jc2, Jc3, Jc4.]
Updated 12d ago
Evaluation Results
| Method | Date | Return | RHSI (raw) | Jc2 | Jc3 | Jc4 |
|---|---|---|---|---|---|---|
| Unconstrained | 2026.04 | 34.64 | 69.92 | 91.7 | 19.02 | 27.14 |
| Reward-Shaped | 2026.04 | 34.64 | 69.92 | 91.7 | 19.02 | 27.14 |
| Post-hoc | 2026.04 | 34.64 | 69.92 | 91.7 | 19.02 | 27.14 |
| MC-CPO, frontier=enabled | 2026.04 | 32.73 | 44.47 | 87.1 | 9.69 | 23.14 |
| MC-CPO (no frontier), frontier=disabled | 2026.04 | 32.68 | 44.51 | 87.2 | 9.69 | 23.12 |