Reinforcement Learning on Ant delta=[0.2^4, 0.5^4], kappa=2.5 v5 (test)

Return: 4,260

DD-SRad

Updated 2mo ago

Evaluation Results

Method	Date	Return	(unlabeled)	(unlabeled)
DD-SRad	2026.05	4,260	4.1	54.8
D-Tanh	2026.05	3,484	22.4	56.5
DD-SRad	2026.05	3,112	25.3	49.2
BoxPre+	2026.05	2,917	9.9	51.8
D-Tanh	2026.05	2,719	6.4	57.0
SRad-QP	2026.05	2,147	9.1	55.4
BoxPre+	2026.05	1,998	6.3	45.3
Post(QP)	2026.05	1,998	14.1	68.9
SRad-QP	2026.05	1,703	13.6	58.3
Post(QP)	2026.05	1,517	9.0	65.1
SRad-Strict	2026.05	1,448	68.8	14.1
SRad-Strict	2026.05	1,246	30.2	17.6