The Differentiable Cross-Entropy Method

About

We study the cross-entropy method (CEM) for the non-convex optimization of a continuous, parameterized objective function and introduce a differentiable variant (DCEM) that lets us differentiate the output of CEM with respect to the objective function's parameters. In the machine learning setting, this brings CEM inside the end-to-end learning pipeline, which has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In the control setting, we show how to embed optimal action sequences into a lower-dimensional space. DCEM enables us to fine-tune CEM-based controllers with policy optimization.

Brandon Amos, Denis Yarats • 2019
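
For readers unfamiliar with CEM, the following is a minimal sketch of the plain, non-differentiable method with a diagonal Gaussian sampling distribution. It is not the authors' implementation and does not implement the differentiable variant; the names `cem_minimize` and `shifted_sphere`, the parameter `theta`, and all hyperparameter values are assumptions chosen for illustration. The hard elite selection is piecewise constant in the objective's parameters, which is one reason the plain method does not pass useful gradients.

```python
import numpy as np

def cem_minimize(f, dim, n_iters=50, n_samples=100, n_elite=10,
                 init_sigma=2.0, seed=0):
    """Plain CEM sketch: minimize f over R^dim with a diagonal Gaussian."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(dim)                    # mean of the sampling distribution
    sigma = np.full(dim, init_sigma)      # per-dimension standard deviation
    for _ in range(n_iters):
        # 1. Sample candidate solutions from the current distribution.
        x = mu + sigma * rng.standard_normal((n_samples, dim))
        # 2. Evaluate the objective on every candidate.
        values = np.array([f(xi) for xi in x])
        # 3. Keep the n_elite candidates with the lowest objective value.
        #    This hard top-k step is what blocks gradients in plain CEM.
        elite = x[np.argsort(values)[:n_elite]]
        # 4. Refit the distribution to the elite set.
        mu = elite.mean(axis=0)
        sigma = elite.std(axis=0) + 1e-6  # small floor avoids a zero std
    return mu

# Hypothetical parameterized objective: a sphere function shifted by theta.
def shifted_sphere(x, theta=np.array([1.0, -2.0])):
    return float(np.sum((x - theta) ** 2))

x_star = cem_minimize(shifted_sphere, dim=2)
print(x_star)
```

In this toy setup the returned `x_star` plays the role of the CEM output, and `theta` plays the role of the objective's parameters with respect to which a differentiable variant would provide derivatives.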

Related benchmarks

Task	Dataset	Metric	Value	
High-dimensional optimization	MSLR	Convergence Value	-8.7281	21
High-dimensional optimization	LIMO	Convergence Value	-4.3841	20
High-dimensional optimization	Lasso-Hard	Convergence Value	43.6111	20
Function Optimization	Rosenbrock D=1000	Convergence Value	5.27e+5	19
Function Optimization	Levy D=1000	Convergence Value	134.6	19
Function Optimization	Michalewicz D=1000	Convergence Value	-7.6842	19
Function Optimization	Sphere D=1000	Final Value	122.2	19
Function Optimization	Griewank D=1000	Convergence Value (Statistic)	80.5171	19
Function Optimization	Dixon D=1000	Convergence Value	9.01e+5	19
High-dimensional optimization	Rosenbrock D=10000	Convergence Value	4.12e+5	13

Showing 10 of 15 rows
