
The Differentiable Cross-Entropy Method

About

We study the cross-entropy method (CEM) for the non-convex optimization of a continuous and parameterized objective function and introduce a differentiable variant (DCEM) that enables us to differentiate the output of CEM with respect to the objective function's parameters. In the machine learning setting, this brings CEM inside the end-to-end learning pipeline, which has otherwise been impossible. We show applications in a synthetic energy-based structured prediction task and in non-convex continuous control. In the control setting, we show how to embed optimal action sequences into a lower-dimensional space. DCEM enables us to fine-tune CEM-based controllers with policy optimization.

Brandon Amos, Denis Yarats • 2019
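
To make the abstract's terms concrete, here is a minimal NumPy sketch of vanilla CEM with a diagonal Gaussian sampling distribution. It is an illustration under stated assumptions, not the authors' implementation: the function names (`cem_minimize`, `soft_elite_weights`) and all hyperparameters are hypothetical. The hard top-k elite selection in `cem_minimize` is the step that is not differentiable. `soft_elite_weights` shows one simple smooth stand-in, a temperature softmax over the negated objective values, which may differ from the relaxation the paper actually uses.

```python
import numpy as np


def cem_minimize(f, mu0, sigma0, n_samples=100, n_elite=10, n_iters=50, seed=0):
    """Vanilla CEM: repeatedly sample, keep the best points, refit the Gaussian."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu0, dtype=float).copy()
    sigma = np.asarray(sigma0, dtype=float).copy()
    for _ in range(n_iters):
        # Draw candidates from the current diagonal Gaussian.
        x = mu + sigma * rng.standard_normal((n_samples, mu.size))
        values = np.array([f(xi) for xi in x])
        # Hard top-k selection: keep the n_elite lowest-valued samples.
        elite = x[np.argsort(values)[:n_elite]]
        # Refit the sampling distribution to the elites.
        mu = elite.mean(axis=0)
        sigma = elite.std(axis=0) + 1e-6  # small floor avoids a zero-width Gaussian
    return mu


def soft_elite_weights(values, temperature=1.0):
    """Smooth stand-in for top-k (an assumption): softmax over negated values."""
    values = np.asarray(values, dtype=float)
    z = -(values - values.min()) / temperature  # largest entry is 0, so exp is stable
    w = np.exp(z)
    return w / w.sum()


def sphere(x):
    """Simple convex test objective with its minimum at the origin."""
    return float(np.sum(x ** 2))
```

Calling `cem_minimize(sphere, mu0=[3.0, -2.0], sigma0=[5.0, 5.0])` should return a point close to the origin. Replacing the hard selection with weights from `soft_elite_weights` in a weighted mean and variance update would make each iteration a smooth function of the objective values, which is the general idea behind differentiating through the optimizer.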

Related benchmarks

Task                           Dataset             Result                                  Rank
High-dimensional optimization  MSLR                Convergence Value: -8.7281              21
High-dimensional optimization  LIMO                Convergence Value: -4.3841              20
High-dimensional optimization  Lasso-Hard          Convergence Value: 43.6111              20
Function Optimization          Rosenbrock D=1000   Convergence Value: 5.27e+5              19
Function Optimization          Levy D=1000         Convergence Value: 134.6                19
Function Optimization          Michalewicz D=1000  Convergence Value: -7.6842              19
Function Optimization          Sphere D=1000       Final Value: 122.2                      19
Function Optimization          Griewank D=1000     Convergence Value (Statistic): 80.5171  19
Function Optimization          Dixon D=1000        Convergence Value: 9.01e+5              19
High-dimensional optimization  Rosenbrock D=10000  Convergence Value: 4.12e+5              13
Showing 10 of 15 rows
