Constrained Reinforcement Learning for Tutoring Curricula on neural environment 25-concept
[Chart: Return by method. Highlighted value: 34.79 (Unconstrained), Apr 5, 2026. Available metrics: Return, RHSI (raw), Saturation Rate.]
Updated 12d ago
Evaluation Results
| Method | Settings | Date | Return | RHSI (raw) | Saturation Rate |
|---|---|---|---|---|---|
| Unconstrained | | 2026.04 | 34.79 | 70.52 | 0 |
| Reward Shaped | lambda={0.05, 0.1, 0.2... | 2026.04 | 34.79 | 70.52 | 0 |
| Posthoc | | 2026.04 | 34.79 | 70.52 | 0 |
| MC-CPO (no frontier) | epsilon_min=0 | 2026.04 | 32.11 | 40.05 | 70 |
| MC-CPO | epsilon_min=0.05 | 2026.04 | 31.97 | 39.16 | 80 |
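To compare the rows above against the Unconstrained baseline, the differences can be computed directly from the reported numbers. The following is a minimal sketch, not part of the benchmark itself: the `RESULTS` list simply copies the Return, RHSI (raw), and Saturation Rate values listed above, and the function name `deltas_vs_baseline` is hypothetical. It makes no assumption about whether higher or lower is better for any metric; it only reports changes relative to the baseline row.

```python
# Values copied from the Evaluation Results above:
# (method, return, rhsi_raw, saturation_rate)
RESULTS = [
    ("Unconstrained", 34.79, 70.52, 0),
    ("Reward Shaped", 34.79, 70.52, 0),
    ("Posthoc", 34.79, 70.52, 0),
    ("MC-CPO (no frontier)", 32.11, 40.05, 70),
    ("MC-CPO", 31.97, 39.16, 80),
]


def deltas_vs_baseline(results, baseline="Unconstrained"):
    """Return per-method changes relative to the baseline row.

    Hypothetical helper for illustration: Return and RHSI changes are
    expressed as percentages of the baseline value, and the saturation
    rate change is an absolute difference in the reported units.
    """
    base = next(row for row in results if row[0] == baseline)
    _, base_ret, base_rhsi, base_sat = base
    out = {}
    for name, ret, rhsi, sat in results:
        out[name] = {
            "return_change_pct": 100.0 * (ret - base_ret) / base_ret,
            "rhsi_change_pct": 100.0 * (rhsi - base_rhsi) / base_rhsi,
            "saturation_change": sat - base_sat,
        }
    return out


if __name__ == "__main__":
    for name, d in deltas_vs_baseline(RESULTS).items():
        print(
            f"{name}: return {d['return_change_pct']:+.2f}%, "
            f"RHSI {d['rhsi_change_pct']:+.2f}%, "
            f"saturation {d['saturation_change']:+d}"
        )
```

Passing a different `baseline` name (for example `"MC-CPO (no frontier)"`) gives the same comparison against another row.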