Offline-to-online Reinforcement Learning on MinAtar

Breakout Score (Online): 17.1

DRIFT

May 12, 2026
Updated 21d ago

Evaluation Results

Method  Links
2026.05
17.10.430.51.171.26250.151.842.2342.640.9117.55
2026.05
17.010.010.661.132.5423.560.120.9041.60.6716.84
2026.05
16.730.320.531.051.8922.980.371048.580.6218.07
2026.05
14.690.620.581.1522.9726.50.420.95052.564.9219.17
2026.05
14.611.190.671.0419.8925.710.460.860.2754.094.519.26
2026.05
14.250.380.551.062.9925.380.240.89049.040.8318.12
2026.05
12.480.290.550.910.3124.570.380.580.1935.520.3414.81
2026.05
0.52--0.5-0.13-0.13-2.96-0.85
2026.05
0.510.50.380.5315.8515.690.280.24.524.524.314.29
2026.05
0.240.190.440.530.950.760.330.35000.380.38