
Offline-to-Online Reinforcement Learning on D4RL (6 environments, min-max normalized, averaged)

Normalized Regret: 0.031

SMAC

[Chart: axis ticks at -0.00624, 0.24513, 0.4965, 0.74787; date Feb 19, 2026]
Updated 4d ago
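
The benchmark title says that scores are min-max normalized and averaged over 6 environments. The page does not state the exact formula, so the following is only a minimal sketch of one common way to do that, assuming each environment has a known lower and upper bound. The function names, environment names, bounds, and numbers are all hypothetical and are not taken from the leaderboard.

```python
from statistics import mean


def min_max_normalize(value, lo, hi):
    """Map value from the range [lo, hi] onto [0, 1].

    lo and hi are assumed per-environment bounds (hypothetical here).
    """
    if hi == lo:
        raise ValueError("lower and upper bounds must differ")
    return (value - lo) / (hi - lo)


def averaged_normalized_score(raw, bounds):
    """Average the min-max normalized scores over every environment in raw."""
    return mean(min_max_normalize(raw[env], *bounds[env]) for env in raw)


# Illustrative values only; a real evaluation would cover all 6 environments.
raw = {"env_a": 2.0, "env_b": 30.0}
bounds = {"env_a": (0.0, 10.0), "env_b": (20.0, 60.0)}
print(averaged_normalized_score(raw, bounds))
```

Normalizing each environment separately before averaging keeps an environment with a large raw scale from dominating the aggregate.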

Evaluation Results

Date      Normalized Regret
2026.02   0.031
2026.02   0.09
2026.02   0.226
2026.02   0.38
2026.02   0.442
2026.02   0.448
2026.02   0.471
2026.02   0.482
2026.02   0.494
2026.02   0.508
2026.02   0.545
2026.02   0.562
2026.02   0.614
2026.02   0.653
2026.02   0.654
2026.02   0.962