
Offline Reinforcement Learning on D4RL Locomotion Full datasets

Hopper Score (m): 100.8

A2PR

[Chart: Hopper Score (m); Jan 28, 2026]
Updated 4d ago

Evaluation Results

Method  Links
2026.01  100.8  101.5  112.1  115    68.6   56.6   98.3   103.2  89.7   94.4   114.7  114.2  97.4
2026.01  98.6   76.2   102.7  107.4  40.6   42.8   78.7   93.5   86.8   87.3   110.4  107.3  86.1
2026.01  93.9   92.5   112.8  113.2  44     41     92.5   95.3   83.6   77.6   113.1  113.8  89.4
2026.01  86.7   78.7   95.9   110    48.2   42.2   92     94.3   77.5   66.1   106.4  110.2  84
2026.01  85.9   100.7  89.4   110.1  48.5   44.7   91.6   93.2   81.8   90.9   108.6  109.7  87.9
2026.01  66.3   94.7   91.5   99.3   47.4   44     86.7   88.9   78.3   73.9   109.6  109.7  82.5
2026.01  59.3   60.9   98     100.1  48.3   44.6   90.7   82.1   83.7   81.8   110.1  108.2  80.7
2026.01  58.5   95     105.4  98.4   41     45.5   91.6   95.6   72.5   77.2   108.8  110.3  83.3
2026.01  52.9   18.1   52.5   108    42.6   55.2   55.2   92.2   75.3   26     107.5  107.9  66.1