
A Schrödinger Eigenfunction Method for Long-Horizon Stochastic Optimal Control

About

High-dimensional stochastic optimal control (SOC) becomes harder as the planning horizon grows: the memory and runtime of existing methods scale linearly in the horizon $T$, and their performance often deteriorates exponentially. We overcome these limitations for a subclass of linearly-solvable SOC problems: those whose uncontrolled drift is the gradient of a potential. In this setting, the Hamilton-Jacobi-Bellman equation reduces to a linear PDE governed by an operator $\mathcal{L}$. We prove that, under the gradient drift assumption, $\mathcal{L}$ is unitarily equivalent to a Schrödinger operator $\mathcal{S} = -\Delta + \mathcal{V}$ with purely discrete spectrum, so the long-horizon control can be described efficiently via the eigensystem of $\mathcal{L}$. This connection provides two key results. First, for a symmetric linear-quadratic regulator (LQR), $\mathcal{S}$ matches the Hamiltonian of a quantum harmonic oscillator, whose closed-form eigensystem yields an analytic solution to the symmetric LQR with arbitrary terminal cost. Second, in a more general setting, we learn the eigensystem of $\mathcal{L}$ using neural networks. We identify implicit reweighting issues in existing eigenfunction learning losses that degrade performance in control tasks, and propose a novel loss function to mitigate them. We evaluate our method on several long-horizon benchmarks, achieving an order-of-magnitude improvement in control accuracy over state-of-the-art methods while reducing memory usage and runtime complexity from $\mathcal{O}(Td)$ to $\mathcal{O}(d)$.
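The harmonic-oscillator connection can be checked numerically in one dimension. The sketch below is not the authors' method and uses conventions that are assumptions of this example, not taken from the paper: uncontrolled dynamics $dX = -U'(X)\,dt + \sqrt{2}\,dW$ with $U(x) = a x^2/2$, a running state cost $q x^2$, and the linear operator $\mathcal{L} g = g'' - U' g' - q x^2 g$. Under these assumptions, conjugating with $e^{-U/2}$ turns $-\mathcal{L}$ into $\mathcal{S} = -\frac{d^2}{dx^2} + (a^2/4 + q)\,x^2 - a/2$, whose eigenvalues are $\omega(2n+1) - a/2$ with $\omega = \sqrt{a^2/4 + q}$. The code discretizes $\mathcal{S}$ by finite differences and compares its lowest eigenvalues with this closed form; the helper names (`sturm_count`, `numeric_eigenvalues`, and so on) are hypothetical.

```python
import math

def sturm_count(diag, off_sq, lam):
    """Count eigenvalues below lam of a symmetric tridiagonal matrix with
    diagonal `diag` and constant squared off-diagonal entry `off_sq`."""
    count, d = 0, 1.0
    for i, a_i in enumerate(diag):
        # Pivot of the LDL^T factorization of (matrix - lam * I).
        d = a_i - lam - (off_sq / d if i > 0 else 0.0)
        if d == 0.0:
            d = -1e-30  # nudge off an exact zero pivot
        if d < 0.0:
            count += 1
    return count

def kth_eigenvalue(diag, off_sq, k, lo, hi, iters=60):
    """Bisection for the k-th smallest eigenvalue (k = 0, 1, ...)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sturm_count(diag, off_sq, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def numeric_eigenvalues(a, q, n_eigs=4, half_width=8.0, n_grid=800):
    """Lowest eigenvalues of S = -d^2/dx^2 + (a^2/4 + q) x^2 - a/2,
    discretized with central differences on [-half_width, half_width]."""
    h = 2.0 * half_width / (n_grid - 1)
    xs = [-half_width + i * h for i in range(n_grid)]
    # Assumed convention: V(x) = U'(x)^2 / 4 - U''(x) / 2 + q x^2 with U = a x^2 / 2.
    diag = [2.0 / h**2 + (a * a / 4.0 + q) * x * x - a / 2.0 for x in xs]
    off_sq = (1.0 / h**2) ** 2
    lo = min(diag) - 2.0 / h**2  # Gershgorin bounds on the spectrum
    hi = max(diag) + 2.0 / h**2
    return [kth_eigenvalue(diag, off_sq, k, lo, hi) for k in range(n_eigs)]

def analytic_eigenvalues(a, q, n_eigs=4):
    """Closed-form harmonic-oscillator eigenvalues omega * (2n + 1) - a / 2."""
    omega = math.sqrt(a * a / 4.0 + q)
    return [omega * (2 * n + 1) - a / 2.0 for n in range(n_eigs)]

if __name__ == "__main__":
    a, q = 1.0, 0.75  # illustrative values giving omega = 1
    for num, exact in zip(numeric_eigenvalues(a, q), analytic_eigenvalues(a, q)):
        print(f"numeric {num:.4f}   analytic {exact:.4f}")
```

With $q = 0$ the lowest eigenvalue is zero, which corresponds to the constant eigenfunction of the uncontrolled generator. The spectrum is discrete with a fixed gap $2\omega$, which is the property that makes a truncated eigen-expansion attractive for long horizons.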

Louis Claeys, Artur Goldman, Zebang Shen, Niao He • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Stochastic Optimal Control | QUADRATIC ANISOTROPIC | Control Objective: 31.3476 | 9 |
| Stochastic Optimal Control | Double-Well | Control Objective Value: 32.4421 | 9 |
| Stochastic Optimal Control | QUADRATIC ISOTROPIC | Control Objective: 32.7763 | 9 |
| Stochastic Optimal Control | QUADRATIC REPULSIVE | Control Objective: 150.1 | 8 |
| Networked control problem | Opinion dynamics De Groot model N=10 agents 80,000 iterations (train) | Control Objective Score: 73.33 | 4 |