Variational Inference MPC using Tsallis Divergence
About
In this paper, we provide a generalized framework for Variational Inference-Stochastic Optimal Control by using the non-extensive Tsallis divergence. By incorporating the deformed exponential function into the optimality likelihood function, a novel Tsallis Variational Inference-Model Predictive Control algorithm is derived, which includes prior works such as Variational Inference-Model Predictive Control, Model Predictive Path Integral Control, Cross Entropy Method, and Stein Variational Inference Model Predictive Control as special cases. The proposed algorithm allows for effective control of the cost/reward transform and is characterized by superior performance in terms of mean and variance reduction of the associated cost. The aforementioned features are supported by a theoretical and numerical analysis of the level of risk sensitivity of the proposed algorithm as well as simulation experiments on 5 different robotic systems with 3 different policy parameterizations.
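The abstract does not spell out the deformed exponential, so the following is only a minimal sketch, assuming the standard Tsallis q-exponential, exp_q(x) = [1 + (1 - q) x]_+^(1 / (1 - q)), which reduces to the ordinary exponential as q approaches 1. It shows how such a transform could turn sampled trajectory costs into normalized weights for a sampling-based MPC update. The names `q_exp`, `tsallis_weights`, `q`, and `lam`, as well as the shift by the minimum cost, are assumptions made for this illustration and are not taken from the paper, whose parameterization may differ.

```python
import numpy as np


def q_exp(x, q):
    """Tsallis deformed exponential [1 + (1 - q) x]_+ ** (1 / (1 - q)).

    Falls back to np.exp when q is (numerically) 1.
    This sketch only calls it with x <= 0 (negated, scaled costs).
    """
    x = np.asarray(x, dtype=float)
    if np.isclose(q, 1.0):
        return np.exp(x)
    base = np.maximum(1.0 + (1.0 - q) * x, 0.0)  # clip at zero: the [.]_+ part
    return base ** (1.0 / (1.0 - q))


def tsallis_weights(costs, q, lam):
    """Map sampled trajectory costs to normalized weights (hypothetical helper).

    Assumption of this sketch: costs are shifted by their minimum so the best
    sample always has unnormalized weight 1. For q != 1 this shift changes the
    resulting weights, unlike in an ordinary softmax.
    """
    costs = np.asarray(costs, dtype=float)
    shifted = (costs - costs.min()) / lam
    w = q_exp(-shifted, q)
    return w / w.sum()


# Usage: three sampled control sequences (horizon 2, one input) and their costs.
samples = np.array([[0.0, 0.1], [0.5, 0.4], [1.0, 0.9]])
costs = np.array([0.0, 1.0, 3.0])

w = tsallis_weights(costs, q=0.5, lam=1.0)  # weights are 0.8, 0.2, 0.0
new_mean = w @ samples                      # weighted average of the samples
```

With `q=1.0` the weights become the exponential weighting exp(-cost / lam) familiar from Model Predictive Path Integral Control. With `q < 1` every sample whose shifted cost exceeds `lam / (1 - q)` receives exactly zero weight, which loosely resembles the elite selection of the Cross Entropy Method.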
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Trajectory Optimization | 2D Car (nx=3, nu=2, Tmpc=70) [Feasible Trajectories] | Success Rate (%): 82.5 | 7 |
| Trajectory Optimization | 2D Car (nx=3, nu=2, Tmpc=70) [All Trajectories] | Success Rate (%): 82.5 | 7 |
| Trajectory Optimization | 2D Car (nx=4, nu=2, Tmpc=50) [Feasible Trajectories] | Success Rate (%): 78.1 | 7 |
| Trajectory Optimization | 2D Car (nx=4, nu=2, Tmpc=50) [All Trajectories] | Success Rate (%): 78.1 | 7 |
| Trajectory Optimization | Quadrotor (nx=12, nu=4, Tmpc=50) [Feasible Trajectories] | Success Rate (%): 75 | 7 |
| Trajectory Optimization | Quadrotor (nx=12, nu=4, Tmpc=50) [All Trajectories] | Success Rate (%): 75 | 7 |
| Trajectory Optimization | Quadrotor (nx=12, nu=4, Tmpc=70) [Feasible Trajectories] | Success Rate (%): 69.4 | 7 |
| Trajectory Optimization | Quadrotor (nx=12, nu=4, Tmpc=70) [All Trajectories] | Success Rate (%): 69.4 | 7 |
| Trajectory Optimization | Ant (nx=29, nu=8, Tmpc=40) [Feasible Trajectories] | Success Rate (%): 54 | 7 |
| Trajectory Optimization | Ant (nx=29, nu=8, Tmpc=40) [All Trajectories] | Success Rate (%): 54 | 7 |