The Differentiable Cross-Entropy Method
About
We study the cross-entropy method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant (DCEM) that enables us to differentiate the output of CEM with respect to the objective function's parameters. In the machine learning setting, this brings CEM inside the end-to-end learning pipeline, which has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In the control setting, we show how to embed optimal action sequences into a lower-dimensional space. DCEM enables us to fine-tune CEM-based controllers with policy optimization.
Brandon Amos, Denis Yarats • 2019
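For readers unfamiliar with CEM, the following is a minimal sketch of the basic loop the abstract refers to: sample candidates from a Gaussian, keep the lowest-cost ones, and refit the Gaussian to them. It is not the authors' implementation. The names `cem`, `shifted_sphere`, `THETA`, and `temp` are illustrative assumptions. The optional `temp` branch swaps the hard top-k selection for softmax weights, a simple smooth stand-in that shows why a soft selection step makes the output vary smoothly with the costs; it is not claimed to be the relaxation used in the paper.

```python
import numpy as np


def cem(f, dim, n_iter=50, n_samples=100, n_elite=10, temp=None, seed=0):
    """Minimize f over R^dim with a diagonal-Gaussian cross-entropy method.

    f maps an array of shape (n_samples, dim) to costs of shape (n_samples,).
    temp=None uses the usual hard top-k elite set. A positive temp replaces it
    with softmax weights over all samples (an illustrative smooth stand-in,
    not the relaxation from the paper).
    """
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.ones(dim)
    for _ in range(n_iter):
        # Draw candidates from the current sampling distribution.
        x = mu + sigma * rng.standard_normal((n_samples, dim))
        cost = f(x)
        if temp is None:
            # Hard selection: equal weight on the n_elite lowest-cost samples.
            w = np.zeros(n_samples)
            w[np.argsort(cost)[:n_elite]] = 1.0 / n_elite
        else:
            # Soft selection: lower cost gets larger weight, smooth in the costs.
            z = -(cost - cost.min()) / temp
            w = np.exp(z) / np.exp(z).sum()
        # Refit the sampling distribution to the weighted samples.
        mu = w @ x
        sigma = np.sqrt(w @ (x - mu) ** 2) + 1e-6
    return mu


# Hypothetical parameterized objective: squared distance to THETA.
THETA = np.array([1.0, -0.5])


def shifted_sphere(x):
    return ((x - THETA) ** 2).sum(axis=1)


x_hard = cem(shifted_sphere, dim=2)            # hard top-k elites
x_soft = cem(shifted_sphere, dim=2, temp=0.5)  # softmax-weighted variant
```

Both calls should return a point near `THETA`. In the hard variant the `argsort` step is piecewise constant in the costs, so gradients with respect to the objective's parameters carry no information through the selection, which is the obstacle a differentiable variant has to address.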
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| High-dimensional optimization | MSLR | Convergence Value | -8.7281 | 21 |
| High-dimensional optimization | LIMO | Convergence Value | -4.3841 | 20 |
| High-dimensional optimization | Lasso-Hard | Convergence Value | 43.6111 | 20 |
| Function Optimization | Rosenbrock D=1000 | Convergence Value | 5.27e+5 | 19 |
| Function Optimization | Levy D=1000 | Convergence Value | 134.6 | 19 |
| Function Optimization | Michalewicz D=1000 | Convergence Value | -7.6842 | 19 |
| Function Optimization | Sphere D=1000 | Final Value | 122.2 | 19 |
| Function Optimization | Griewank D=1000 | Convergence Value (Statistic) | 80.5171 | 19 |
| Function Optimization | Dixon D=1000 | Convergence Value | 9.01e+5 | 19 |
| High-dimensional optimization | Rosenbrock D=10000 | Convergence Value | 4.12e+5 | 13 |