
Implicit Quantile Networks for Distributional Reinforcement Learning

About

In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN. We achieve this by using quantile regression to approximate the full quantile function for the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE, and use our algorithm's implicitly defined distributions to study the effects of risk-sensitive policies in Atari games.

Will Dabney, Georg Ostrovski, David Silver, Rémi Munos • 2018
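The core ingredients described above (sampling quantile fractions, a cosine embedding of the fraction, and quantile regression with a Huber loss) can be sketched in a few lines. This is a minimal numpy sketch, not the paper's implementation: the function names are ours, the network layers are omitted, and only the loss and embedding math are shown.

```python
import numpy as np

def huber(delta, kappa=1.0):
    # Standard Huber loss: quadratic near zero, linear in the tails.
    abs_d = np.abs(delta)
    return np.where(abs_d <= kappa, 0.5 * delta ** 2, kappa * (abs_d - 0.5 * kappa))

def quantile_huber_loss(td_errors, taus, kappa=1.0):
    # td_errors: (N, N') pairwise TD errors, target sample j minus predicted quantile i.
    # taus: (N,) quantile fractions attached to the predicted quantiles.
    indicator = (td_errors < 0).astype(float)
    weight = np.abs(taus[:, None] - indicator)  # asymmetric quantile-regression weight
    return (weight * huber(td_errors, kappa) / kappa).mean()

def cosine_embedding(taus, n=64):
    # Embed each sampled fraction tau as cos(pi * i * tau), i = 0..n-1,
    # before the learned linear layer + ReLU used in IQN-style networks.
    i = np.arange(n)
    return np.cos(np.pi * i[None, :] * taus[:, None])
```

In an IQN-style network the embedded fractions are combined multiplicatively with the convolutional state features, so a single network implicitly represents the whole return distribution rather than a fixed set of quantiles.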

Related benchmarks

Task | Dataset | Metric | Result | Rank
Atari Game Playing | Atari 2600, 57 games (human starts) | Median Human-Normalized Score | 162 | 14
Distributional Reinforcement Learning | American Put Option (test) | CVaR 1.0 | 0.4 | 13
Atari Game Playing | Atari 57 games (200M environment frames) | Median Human-Normalized Score | 218 | 11
Reinforcement Learning | Windy Lunar Lander standard (test) | Expected Value (E) | 32.73 | 10
Reinforcement Learning | 55 Atari games | Mean Human-Normalized Score | 940 | 10
Climate Debias | Climate Debias (test) | Energy Distance (ED) | 0.116 | 8
Precipitation Downscaling | Precip. Downscale (test) | Energy Distance (ED) | 0.393 | 8
Elliptic PDE Inverse Problem | Elliptic PDE Inv. (test) | Energy Distance (ED) | 0.139 | 8
Fluid Flow Prediction | Navier-Stokes (test) | Energy Distance (ED) | 0.263 | 8
GP regression | GP Regression 2D (test) | Energy Distance | 0.427 | 8

(Showing 10 of 18 rows)
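The risk-sensitive results above (e.g. the CVaR entry) follow from distorting the sampling of quantile fractions at decision time: drawing fractions only from the lower tail yields a CVaR-style, risk-averse action value. A minimal sketch, assuming a `quantile_fn(state, taus)` that returns per-action return quantiles (the toy function below is purely illustrative):

```python
import numpy as np

def cvar_q_values(quantile_fn, state, alpha=0.25, n_samples=32, rng=None):
    # Risk-averse Q-values under the CVaR distortion beta(tau) = alpha * tau:
    # sample fractions only from [0, alpha], i.e. the worst alpha-fraction of returns.
    rng = np.random.default_rng(rng)
    taus = alpha * rng.uniform(size=n_samples)
    z = quantile_fn(state, taus)  # (n_samples, n_actions) return quantiles
    return z.mean(axis=0)         # average of lower-tail quantiles per action

# Toy quantile function: action 0's return is U(0, 1) (quantile tau -> tau),
# action 1's is its mirror image (tau -> 1 - tau).
toy_quantiles = lambda state, taus: np.stack([taus, 1.0 - taus], axis=1)
q = cvar_q_values(toy_quantiles, state=None, alpha=0.25, rng=0)
```

With alpha = 1 this recovers the ordinary (risk-neutral) expected return; smaller alpha makes the greedy policy increasingly averse to low-return outcomes.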
