Multimodal Policy Fine-Tuning on Gaussian-mixture environment (G1 landscape)
[Chart: Success Rate (SR); displayed entry: DSRL, 100 (May 12, 2026). Selectable metrics: Success Rate (SR), Success Rate Mean (SRM), Mean Coverage (mc@80), Entropy (H).]
Updated 21d ago
Evaluation Results
Method     Fine-tuning variant  Date     Success Rate (SR)  Success Rate Mean (SRM)  Mean Coverage (mc@80)  Entropy (H)
---------  -------------------  -------  -----------------  -----------------------  ---------------------  -----------
DSRL       St...                2026.05  100                25                       0.25                   0
DPPO       St...                2026.05  100                58                       0.5825                 0.4
RES        BMD                  2026.05  100                100                      1                      0.99
DPPO       BMD                  2026.05  100                100                      1                      0.99
RES        St...                2026.05  98                 98                       1                      1
DPPO [10]  St...                2026.05  66                 16                       0.0825                 0
DSRL       BMD                  2026.05  33                 33                       0.3325                 0.46