The Laplacian Keyboard: Beyond the Linear Span
About
Across scientific disciplines, Laplacian eigenvectors serve as a fundamental basis for simplifying complex systems, from signal processing to quantum mechanics. In reinforcement learning (RL), these eigenvectors provide a natural basis for approximating reward functions; however, their use is typically limited to their linear span, which restricts expressivity in complex environments. We introduce the Laplacian Keyboard (LK), a hierarchical framework that goes beyond the linear span. LK constructs a task-agnostic library of options from these eigenvectors, forming a behavior basis guaranteed to contain the optimal policy for any reward within the linear span. A meta-policy learns to stitch these options dynamically, enabling efficient learning of policies outside the original linear constraints. We establish theoretical bounds on zero-shot approximation error and demonstrate empirically that LK surpasses zero-shot solutions while achieving improved sample efficiency compared to standard RL methods.
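The "linear span" the abstract refers to can be made concrete with a small sketch. The code below (a minimal illustration under assumed details, not the paper's implementation) builds the graph Laplacian of a toy chain-MDP state graph, takes its first `k` eigenvectors as a basis, and projects a sparse goal reward onto that span; the residual is exactly the kind of approximation error that motivates going beyond the linear span.

```python
import numpy as np

# Hypothetical toy example: an 8-state chain graph standing in for an MDP's
# state-transition structure. None of the names below come from the paper.
n = 8
A = np.zeros((n, n))
for i in range(n - 1):          # chain adjacency: state i <-> state i+1
    A[i, i + 1] = A[i + 1, i] = 1.0
D = np.diag(A.sum(axis=1))      # degree matrix
L = D - A                       # combinatorial graph Laplacian

# Eigenvectors sorted by eigenvalue: the smoothest functions on the graph.
eigvals, eigvecs = np.linalg.eigh(L)

k = 4                           # basis size (assumed hyperparameter)
Phi = eigvecs[:, :k]            # first k Laplacian eigenvectors (orthonormal)

r = np.zeros(n)
r[-1] = 1.0                     # sparse goal reward at the last state
w = Phi.T @ r                   # least-squares coefficients
r_hat = Phi @ w                 # best approximation within the linear span

approx_error = np.linalg.norm(r - r_hat)  # nonzero: r lies outside the span
```

A sparse reward like `r` is poorly captured by a few smooth eigenvectors, so `approx_error` is strictly positive; LK's meta-policy over eigenvector-derived options is aimed at precisely these rewards that linear zero-shot methods cannot represent.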
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Flip | DMC Walker (average of APS, Proto, RND datasets) | Mean Return | 507 | 3 |
| Jump | DMC Quadruped (average of APS, Proto, RND datasets) | Mean Return | 554 | 3 |
| Run | DMC Quadruped (average of APS, Proto, RND datasets) | Mean Return | 366 | 3 |
| Stand | DMC Quadruped (average of APS, Proto, RND datasets) | Mean Return | 705 | 3 |
| Walk | DMC Walker (average of APS, Proto, RND datasets) | Mean Return | 890 | 3 |
| Run | DMC Cheetah (average of APS, Proto, RND datasets) | Mean Return | 196 | 3 |
| Run | DMC Walker (average of APS, Proto, RND datasets) | Mean Return | 294 | 3 |
| Run-B | DMC Cheetah (average of APS, Proto, RND datasets) | Mean Return | 188 | 3 |
| Stand | DMC Walker (average of APS, Proto, RND datasets) | Mean Return | 635 | 3 |
| Walk | DMC Cheetah (average of APS, Proto, RND datasets) | Mean Return | 709 | 3 |