Coarse-to-Fine Q-attention: Efficient Learning for Visual Robotic Manipulation via Discretisation
About
We present a coarse-to-fine discretisation method that enables discrete reinforcement-learning approaches to be used in place of unstable and data-inefficient actor-critic methods in continuous robotics domains. The approach builds on the recently released ARM algorithm, replacing its continuous next-best-pose agent with a discrete one driven by coarse-to-fine Q-attention. Given a voxelised scene, coarse-to-fine Q-attention learns which part of the scene to 'zoom' into. Applied iteratively, this 'zooming' yields a near-lossless discretisation of the translation space and allows the use of a discrete-action, deep Q-learning method. We show that our new coarse-to-fine algorithm achieves state-of-the-art performance on several challenging, sparsely rewarded, vision-based RLBench manipulation tasks, and can train real-world policies, tabula rasa, in a matter of minutes with as few as 3 demonstrations.
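The iterative 'zooming' can be sketched as follows. This is a minimal illustration, not the paper's implementation: `q_fn` is a hypothetical stand-in for the learned Q-attention network, and the grid size, depth, and workspace bounds are made-up values. It shows how repeatedly re-voxelising the argmax voxel discretises translation near-losslessly, since the effective resolution shrinks geometrically with depth.

```python
import numpy as np

def coarse_to_fine_translation(q_fn, lo, hi, grid=16, depth=2):
    """Iteratively 'zoom' into the voxel with the highest Q-value.

    q_fn: maps candidate voxel centres (M, 3) to Q-values (M,)
          (a stand-in for the learned Q-attention network).
    lo, hi: workspace bounds, each of shape (3,).
    Returns the selected translation and the final per-axis voxel size.
    """
    lo = np.asarray(lo, dtype=float)
    hi = np.asarray(hi, dtype=float)
    for _ in range(depth):
        # Voxelise the current bounds into a grid x grid x grid volume.
        edges = [np.linspace(lo[d], hi[d], grid + 1) for d in range(3)]
        centres = np.stack(
            np.meshgrid(*[(e[:-1] + e[1:]) / 2 for e in edges], indexing="ij"),
            axis=-1).reshape(-1, 3)
        # Pick the voxel the Q-function attends to most.
        idx = np.unravel_index(int(np.argmax(q_fn(centres))), (grid,) * 3)
        # Zoom: the chosen voxel becomes the next level's workspace.
        lo = np.array([edges[d][idx[d]] for d in range(3)])
        hi = np.array([edges[d][idx[d] + 1] for d in range(3)])
    return (lo + hi) / 2, hi - lo

# Toy Q-function: prefers voxel centres closest to a hypothetical target.
target = np.array([0.12, -0.34, 0.56])
q = lambda c: -np.linalg.norm(c - target, axis=1)

pos, res = coarse_to_fine_translation(q, lo=[-1, -1, 0], hi=[1, 1, 2])
# Two levels of a 16^3 grid over a 2 m range give 2 / 16^2 ≈ 7.8 mm voxels,
# so the returned position lies within one voxel of the target.
```

Note how each level multiplies the resolution by the grid size: depth levels of an N^3 grid discretise each axis into N^depth bins while only ever evaluating depth × N^3 voxels.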
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Robotic Manipulation | RLBench | Avg Success Score | 20.1 | 56 |
| Robotic Manipulation | RLBench (test) | Average Success Rate | 20.1 | 34 |
| Multi-task Robotic Manipulation | RLBench | Avg Success Rate | 16.9 | 16 |
| drag stick | RLBench | Success Rate | 72 | 10 |
| close jar | RLBench | Success Rate | 28 | 10 |
| open drawer | RLBench | Success Rate | 28 | 10 |
| slide block | RLBench | Success Rate | 16 | 10 |
| stack blocks | RLBench | Success Rate | 4 | 10 |
| turn tap | RLBench | Success Rate | 68 | 10 |
| meat off grill | RLBench | Success Rate | 40 | 10 |