Transformer-Based Learned Optimization

About

We propose a new approach to learned optimization in which a neural network represents the computation of an optimizer's update step. The parameters of the optimizer are then learned by training on a set of optimization tasks, with the objective of performing minimization efficiently. Our innovation is a new neural network architecture for the learned optimizer, Optimus, inspired by the classic BFGS algorithm. As in BFGS, we estimate a preconditioning matrix as a sum of rank-one updates, but we use a Transformer-based neural network to predict these updates jointly with the step length and direction. In contrast to several recent learned-optimization approaches, our formulation allows for conditioning across the dimensions of the parameter space of the target problem while remaining applicable to optimization tasks of variable dimensionality without retraining. We demonstrate the advantages of our approach on a benchmark composed of objective functions traditionally used to evaluate optimization algorithms, as well as on the real-world task of physics-based visual reconstruction of articulated 3D human motion.
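For readers unfamiliar with the BFGS algorithm that the abstract cites as inspiration, the following is a minimal sketch of the classic, hand-designed method, not of Optimus itself. It maintains an inverse-Hessian estimate `H` that is refined at every step by low-rank outer-product terms and used to precondition the gradient. The function name `bfgs_minimize`, the backtracking line search, and the quadratic test problem are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def bfgs_minimize(f, grad, x0, iters=100, tol=1e-8):
    """Classic BFGS with a simple backtracking (Armijo) line search."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)            # inverse-Hessian estimate (the preconditioner)
    g = grad(x)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        d = -H @ g           # preconditioned descent direction
        t, fx = 1.0, f(x)
        # Halve the step until the Armijo sufficient-decrease test passes.
        while f(x + t * d) > fx + 1e-4 * t * (g @ d) and t > 1e-12:
            t *= 0.5
        s = t * d
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        sy = s @ y
        if sy > 1e-12:       # curvature condition keeps H positive definite
            rho = 1.0 / sy
            V = np.eye(n) - rho * np.outer(s, y)
            # Low-rank correction built from outer products of s and y.
            H = V @ H @ V.T + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

# Hypothetical test problem: f(x) = 0.5 x^T A x - b^T x, minimized at A^{-1} b.
A = np.diag([1.0, 10.0, 100.0])
b = np.ones(3)
x_star = bfgs_minimize(lambda x: 0.5 * x @ A @ x - b @ x,
                       lambda x: A @ x - b,
                       np.zeros(3))
```

In this classic form the correction terms are computed from the observed step `s` and gradient change `y` by a fixed formula. According to the abstract, Optimus instead predicts the rank-one updates, together with the step length and direction, with a Transformer-based network.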

Erik Gärtner, Luke Metz, Mykhaylo Andriluka, C. Daniel Freeman, Cristian Sminchisescu • 2022

Related benchmarks

Task | Dataset | Metric | Result | Rank
--- | --- | --- | --- | ---
3D Human Pose Estimation | Human3.6M 18 | MPJPE (P-MPJPE) | 82.8 | 7
Global Optimization | Ackley 100d | Mean Final Objective Value | 6.80e-6 | 5
Global Optimization | Dixon-Price 100d | Mean Final Objective Value | 0.463 | 5
Global Optimization | Levy 100d | Mean Final Objective Value | 3.96e-6 | 5
Global Optimization | Ackley 2d | Mean Objective Value | 6.65 | 5
Global Optimization | Perm Function 0, d, beta 100d | Mean Final Objective Value | 4.80e-9 | 5
Global Optimization | Powell 100d | Mean Final Objective Value | 0.0138 | 5
Global Optimization | Griewank 100d | Mean Final Objective Value | 0.0341 | 5
Trajectory Optimization | Human3.6M in domain (val) | MPJPE-G | 24 | 3
Trajectory Optimization | Human3.6M out of domain (val) | MPJPE-G | 25 | 3
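As context for the Global Optimization rows, here is a small sketch of the Ackley test function in its commonly used textbook form. The constants `a`, `b`, and `c` are the conventional defaults. Whether the benchmark above uses exactly this variant, and which domain and initialization it uses, is not stated on this page, so treat those details as assumptions.

```python
import numpy as np

def ackley(x, a=20.0, b=0.2, c=2.0 * np.pi):
    """Ackley function in d = len(x) dimensions; global minimum 0 at x = 0."""
    x = np.asarray(x, dtype=float)
    d = x.size
    term1 = -a * np.exp(-b * np.sqrt(np.sum(x ** 2) / d))
    term2 = -np.exp(np.sum(np.cos(c * x)) / d)
    return term1 + term2 + a + np.e

value_at_origin = ackley(np.zeros(100))   # 100-dimensional, at the minimizer
value_at_ones = ackley(np.ones(2))        # 2-dimensional, away from the minimizer
```

The function is highly multimodal away from the origin, which is why it is a common stress test for optimizers. A mean final objective value close to zero indicates that runs ended near the global minimum.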
