Offline-to-Online Reinforcement Learning on D4RL Antmaze (All Configurations)

Success Rate (Large Diverse): 74.51

ROAD

[Chart: Success Rate (Large Diverse); May 14, 2026]
Updated 19d ago

Evaluation Results

Method | Links
2026.05
74.5169.139289.6786.3599.83
2026.05
70.1564.3387.668795.1697.99
2026.05
69.9969.0193.2584.9951.7597
2026.05
69.8364.1886.4986.668.6897.34
2026.05
69.4866.6686.3388.8386.9999
2026.05
66.9768.3489.8286.8381.0197.66
2026.05
65.9867.0186.6785.496.1797.16
2026.05
64.4764.7591.584.7542.2296.25
2026.05
63.1359.7583.6783.4972.1295.83
2026.05
62.5167.1886.8289.334398
2026.05
60.8346.3282.382.6725.6591.32
2026.05
60.3246.5178.6879.1769.6693.32
2026.05
56.9858.168082.563.4992.16
2026.05
56.0153.6681.6680.1638.593.34
2026.05
55.9853.1680.577.327.8393.16
2026.05
52.1552.1781.8277.1648.8292.99
2026.05
50.6646.8282.678246.5194.33
2026.05
50.3555.1283.2978.6216.1394.37
2026.05
46.8338.3481.8480.9980.6694.33
2026.05
33.516488.589.2583.596.75