Learning What Matters: Adaptive Information-Theoretic Objectives for Robot Exploration

About

Designing learnable information-theoretic objectives for robot exploration remains challenging. Such objectives aim to guide exploration toward data that reduces uncertainty in model parameters, yet it is often unclear what information the collected data can actually reveal. Although reinforcement learning (RL) can optimize a given objective, constructing objectives that reflect parametric learnability is difficult in high-dimensional robotic systems. Many parameter directions are weakly observable or unidentifiable, and even when identifiable directions are selected, omitted directions can still influence exploration and distort information measures. To address this challenge, we propose Quasi-Optimal Experimental Design (QOED), an adaptive information objective grounded in optimal experimental design. QOED (i) performs eigenspace analysis of the Fisher information matrix to identify an observable subspace and select identifiable parameter directions, and (ii) modifies the exploration objective to emphasize these directions while suppressing nuisance effects from non-critical parameters. Under bounded nuisance influence and limited coupling between critical and nuisance directions, QOED provides a constant-factor approximation to the ideal information objective that explores all parameters. We evaluate QOED on simulated and real-world navigation and manipulation tasks, where identifiable-direction selection and nuisance suppression yield performance improvements of 35.23% and 21.98%, respectively. When integrated as an exploration objective in model-based policy optimization, QOED further improves policy performance over established RL baselines.
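
As an illustration of step (i) only, the following is a minimal sketch of eigenspace-based direction selection on a Fisher information matrix. It is not the authors' implementation: the function names, the Gaussian-noise (Gauss-Newton) form of the Fisher information, the relative eigenvalue threshold `rel_tol`, and the log-determinant score are all assumptions chosen for a toy example, and the nuisance-suppression step (ii) is not modeled.

```python
import numpy as np

def fisher_information(jacobians, noise_var=1.0):
    """Assumed Gauss-Newton form: sum_t J_t^T J_t / noise_var (Gaussian noise)."""
    return sum(J.T @ J for J in jacobians) / noise_var

def identifiable_subspace(fim, rel_tol=1e-6):
    """Keep eigenvectors whose eigenvalues exceed rel_tol times the largest one."""
    eigvals, eigvecs = np.linalg.eigh(fim)  # eigenvalues in ascending order
    keep = eigvals > rel_tol * eigvals[-1]
    return eigvecs[:, keep], eigvals[keep]

def projected_logdet(fim, basis):
    """Log-determinant of the FIM restricted to the selected directions."""
    reduced = basis.T @ fim @ basis
    _, logdet = np.linalg.slogdet(reduced)
    return logdet

# Toy data: 3 parameters, but the third never affects the observations,
# so its Jacobian column is zero and that direction is unidentifiable.
rng = np.random.default_rng(0)
jacobians = [
    np.hstack([rng.normal(size=(2, 2)), np.zeros((2, 1))]) for _ in range(20)
]

fim = fisher_information(jacobians)
basis, eigvals = identifiable_subspace(fim)
score = projected_logdet(fim, basis)

print("selected directions:", basis.shape[1])
print("information score on selected subspace:", score)
```

In this toy case the full Fisher information matrix is singular, so a log-determinant over all parameters is undefined, while the score restricted to the selected subspace stays finite. The threshold rule here is a simple heuristic stand-in for whatever selection criterion the paper uses.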

Youwei Yu, Jionghao Wang, Zhengming Yu, Wenping Wang, Lantao Liu • 2026

Related benchmarks

Task                  Dataset      Result                 Rank
Dynamics Prediction   Go1 1σ       RMSE (×100): 28.57     6
Dynamics Prediction   Go1 2σ       RMSE: 0.3222           6
Dynamics Prediction   Go1 3σ       RMSE (×100): 35.49     6
Parameter Estimation  Go1 1σ       RMSE (×100): 848.3     6
Parameter Estimation  Go1 3σ       RMSE (×100): 848       6
Dynamics Prediction   Jackal 1σ    RMSE: 0.01             3
Dynamics Prediction   Jackal 2σ    RMSE: 0.0164           3
Dynamics Prediction   Jackal 3σ    RMSE (×100): 2.64      3
Dynamics Prediction   Hand 1σ      RMSE (Scaled): 3.55    3
Dynamics Prediction   Hand 2σ      RMSE: 0.0386           3
Showing 10 of 18 rows
