Multimodal Policy Fine-Tuning on Gaussian-mixture environment G2

100 SR

DPPO

29.2847.646684.36
May 12, 2026
Updated 21d ago

Evaluation Results

Method Links
2026.05
100421.672
2026.05
100100494
2026.05
10075374
2026.05
9250259
2026.05
3380.330
2026.05
3380.3384
2026.05
3211060