
Dynamics-Aligned Shared Hypernetworks for Zero-Shot Actuator Inversion

About

Zero-shot generalization in contextual reinforcement learning remains a core challenge, particularly when the context is latent and must be inferred from data. A canonical failure mode is actuator inversion, where identical actions produce opposite physical effects under a latent binary context. We propose DMA*-SH, a framework where a single hypernetwork, trained solely via dynamics prediction, generates a small set of adapter weights shared across the dynamics model, policy, and action-value function. This shared modulation imparts an inductive bias matched to actuator inversion, while input/output normalization and random input masking stabilize context inference, promoting directionally concentrated representations. We provide theoretical support via an expressivity separation result for hypernetwork modulation, and a variance decomposition with policy-gradient variance bounds that formalize how within-mode compression improves learning under actuator inversion. For evaluation, we introduce the Actuator Inversion Benchmark (AIB), a suite of environments designed to isolate discontinuous context-to-dynamics interactions. On AIB's held-out actuator-inversion tasks, DMA*-SH achieves zero-shot generalization, outperforming domain randomization by 111.8% and surpassing a standard context-aware baseline by 16.1%.
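The shared-modulation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the layer sizes, the scale-and-shift form of the adapter, the single hidden layer per head, and every name below (`hypernetwork`, `apply_head`, `forward`) are assumptions chosen for illustration. It shows only the forward pass, in which one hypernetwork output is reused by the dynamics model, the policy, and the action-value function.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only for this illustration.
STATE_DIM, ACTION_DIM, CONTEXT_DIM, HIDDEN_DIM = 4, 2, 3, 8


def init_linear(in_dim, out_dim):
    """Return (weights, bias) for a dense layer with small random weights."""
    weights = [[random.gauss(0.0, 0.1) for _ in range(out_dim)]
               for _ in range(in_dim)]
    return weights, [0.0] * out_dim


def linear(x, layer):
    """Apply a dense layer to the vector x."""
    weights, bias = layer
    return [sum(x[i] * weights[i][j] for i in range(len(x))) + bias[j]
            for j in range(len(bias))]


# Hypernetwork: maps a context embedding to one small adapter.
# Assumption: the adapter is a per-unit scale and shift on a hidden layer.
hyper_layer = init_linear(CONTEXT_DIM, 2 * HIDDEN_DIM)


def hypernetwork(context):
    out = linear(context, hyper_layer)
    scale = [1.0 + v for v in out[:HIDDEN_DIM]]
    shift = out[HIDDEN_DIM:]
    return scale, shift


def make_head(in_dim, out_dim):
    """A one-hidden-layer network whose hidden units accept the adapter."""
    return {"in": init_linear(in_dim, HIDDEN_DIM),
            "out": init_linear(HIDDEN_DIM, out_dim)}


def apply_head(head, x, adapter):
    scale, shift = adapter
    pre = linear(x, head["in"])
    hidden = [math.tanh(p * s + b) for p, s, b in zip(pre, scale, shift)]
    return linear(hidden, head["out"])


dynamics = make_head(STATE_DIM + ACTION_DIM, STATE_DIM)
policy = make_head(STATE_DIM, ACTION_DIM)
q_function = make_head(STATE_DIM + ACTION_DIM, 1)


def forward(state, context):
    # The adapter is computed once and shared by all three components.
    # In training, only the dynamics-prediction loss would update
    # hyper_layer; that step is not shown here.
    adapter = hypernetwork(context)
    action = apply_head(policy, state, adapter)
    state_action = state + action  # list concatenation
    next_state = apply_head(dynamics, state_action, adapter)
    q_value = apply_head(q_function, state_action, adapter)
    return action, next_state, q_value


state = [1.0] * STATE_DIM
action_a, _, _ = forward(state, [1.0, 0.0, 0.0])
action_b, _, _ = forward(state, [-1.0, 0.0, 0.0])
```

Because the policy and the action-value function read the same adapter as the dynamics model, a change in the inferred context (here, flipping the sign of the first context coordinate) changes all three components at once, which is the kind of coupling the abstract describes as matched to actuator inversion.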

Jan Benad, Pradeep Kr. Banerjee, Frank Röder, Nihat Ay, Martin V. Butz, Manfred Eppe • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Actuator Inversion | BallInCup (eval-in) | AER 955 | 8 |
| Actuator Inversion | DI-Friction C (train) | AER 71 | 8 |
| Actuator Inversion | DI-Friction (Ceval-in) | AER 0.71 | 8 |
| Zero-Shot Actuator Inversion | AIB Cheetah environment Ceval-out | AER 225 | 8 |
| Actuator Inversion | Walker C (train) | AER 885 | 8 |
| Actuator Inversion | WalkerGym C (train) | AER 3.33e+3 | 8 |
| Actuator Inversion | Walker (Ceval-in) | AER 888 | 8 |
| Actuator Inversion | WalkerGym (Ceval-in) | AER 3.38e+3 | 8 |
| Actuator Inversion | HopperGym (Ceval-in) | AER 2.85e+3 | 8 |
| Zero-Shot Actuator Inversion | AIB DI-Friction environment Ceval-out | AER 62 | 8 |
Showing 10 of 42 rows
