A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning

About

Q-learning algorithms are appealing for real-world applications due to their data-efficiency, but they are very prone to overfitting and training instabilities when trained from visual observations. Prior work, namely SVEA, finds that selective application of data augmentation can improve the visual generalization of RL agents without destabilizing training. We revisit its recipe for data augmentation, and find an assumption that limits its effectiveness to augmentations of a photometric nature. Addressing this limitation, we propose a generalized recipe, SADA, that works with wider varieties of augmentations. We benchmark its effectiveness on DMC-GB2 (our proposed extension of the popular DMControl Generalization Benchmark) as well as tasks from Meta-World and the Distracting Control Suite, and find that SADA greatly improves training stability and generalization of RL agents across a diverse set of augmentations. For visualizations, code, and the benchmark, see https://aalmuzairee.github.io/SADA/

Abdulaziz Almuzairee, Nicklas Hansen, Henrik I. Christensen • 2024

Related benchmarks

Task	Dataset	Result
Reinforcement Learning	DMC-GB2 Video Hard (test)	Cartpole Swingup Return 363	15
PickCube	ManiSkill3 Medium Camera FOV v1 (test)	Success Rate 11	7
PickCube	ManiSkill Easy Camera Fov v3 (test)	Success Rate 26	7
PlaceAppleInBowl	ManiSkill3 Hard Camera Pose v1 (test)	Success Rate 3	7
PokeCube	ManiSkill Medium Table Color 3 (test)	Success Rate 29	7
PokeCube	ManiSkill3 Easy Camera Pose v1 (test)	Success Rate 22	7
PokeCube	ManiSkill3 Medium Camera FOV v1 (test)	Success Rate 11	7
PokeCube	ManiSkill3 Hard Camera Pose v1 (test)	Success Rate 6	7
PullCube	ManiSkill3 Easy Ro Texture	Success Rate 70	7
PullCube	ManiSkill Hard Lighting Direction 3	Success Rate 52	7

Showing 10 of 241 rows
