
The Laplacian Keyboard: Beyond the Linear Span

About

Across scientific disciplines, Laplacian eigenvectors serve as a fundamental basis for simplifying complex systems, from signal processing to quantum mechanics. In reinforcement learning (RL), they similarly form a basis over the state space, enabling reward functions to be approximated by projection onto a small set of eigenvectors. This projection makes zero-shot control possible, but it also imposes a fundamental limitation: the induced policies are only as expressive as the linear span of the chosen eigenvectors. We introduce the Laplacian Keyboard (LK), a hierarchical framework that goes beyond this linear span. LK constructs a task-agnostic library of behaviors from these eigenvectors, forming a behavior basis guaranteed to contain the optimal policy for any reward within the linear span. A meta-policy learns to stitch these behaviors dynamically, enabling efficient learning of policies outside the original linear constraints. We establish theoretical bounds on zero-shot approximation error and demonstrate empirically that LK improves over the zero-shot solution while achieving better sample efficiency compared to standard RL methods.
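The projection step described above can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a tabular 20-state chain as the state space, uses the combinatorial graph Laplacian computed with NumPy, and the helper names (`path_graph_laplacian`, `project_reward`) are hypothetical. It only shows why a reward inside the span of the chosen eigenvectors is recovered exactly while a reward outside it is not; the LK behavior library and meta-policy are not modeled here.

```python
import numpy as np

def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of an n-state chain (toy state space, an assumption)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0   # edge from state i to state i+1
    A[idx + 1, idx] = 1.0   # symmetric edge back
    return np.diag(A.sum(axis=1)) - A

def project_reward(reward, eigvecs, k):
    """Orthogonal projection of a reward vector onto the first k eigenvectors."""
    basis = eigvecs[:, :k]        # columns are orthonormal (returned by eigh)
    weights = basis.T @ reward    # coefficients of the reward in the basis
    return basis @ weights, weights

n_states = 20
laplacian = path_graph_laplacian(n_states)
# eigh returns eigenvalues in ascending order with orthonormal eigenvectors
eigvals, eigvecs = np.linalg.eigh(laplacian)

# A reward built from low-frequency eigenvectors lies inside the span
smooth_reward = 2.0 * eigvecs[:, 1] - 0.5 * eigvecs[:, 3]
# A sparse goal reward (1 at the last state) lies outside the span of a few eigenvectors
goal_reward = np.zeros(n_states)
goal_reward[-1] = 1.0

k = 5
smooth_hat, _ = project_reward(smooth_reward, eigvecs, k)
goal_hat, _ = project_reward(goal_reward, eigvecs, k)
full_hat, _ = project_reward(goal_reward, eigvecs, n_states)

smooth_err = np.linalg.norm(smooth_reward - smooth_hat)
goal_err = np.linalg.norm(goal_reward - goal_hat)
full_err = np.linalg.norm(goal_reward - full_hat)

print(f"in-span reward error (k={k}): {smooth_err:.2e}")
print(f"sparse reward error (k={k}): {goal_err:.3f}")
print(f"sparse reward error (k={n_states}): {full_err:.2e}")
```

In this sketch the in-span reward is reconstructed to numerical precision, while the sparse reward keeps a sizable residual until all eigenvectors are used. That residual is the kind of gap the abstract refers to when it says the induced policies are only as expressive as the linear span of the chosen eigenvectors.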

Siddarth Chandrasekar, Marlos C. Machado • 2026

Related benchmarks

| Task  | Dataset                                            | Result          | Rank |
|-------|----------------------------------------------------|-----------------|------|
| Flip  | DMC Walker, average of APS, Proto, RND datasets    | Mean Return 507 | 3    |
| Jump  | DMC Quadruped, average of APS, Proto, RND datasets | Mean Return 554 | 3    |
| Run   | DMC Quadruped, average of APS, Proto, RND datasets | Mean Return 366 | 3    |
| Stand | DMC Quadruped, average of APS, Proto, RND datasets | Mean Return 705 | 3    |
| Walk  | DMC Walker, average of APS, Proto, RND datasets    | Mean Return 890 | 3    |
| Run   | DMC Cheetah, average of APS, Proto, RND datasets   | Mean Return 196 | 3    |
| Run   | DMC Walker, average of APS, Proto, RND datasets    | Mean Return 294 | 3    |
| Run-B | DMC Cheetah, average of APS, Proto, RND datasets   | Mean Return 188 | 3    |
| Stand | DMC Walker, average of APS, Proto, RND datasets    | Mean Return 635 | 3    |
| Walk  | DMC Cheetah, average of APS, Proto, RND datasets   | Mean Return 709 | 3    |
Showing 10 of 12 rows
