Adaptive Learned State Estimation based on KalmanNet
About
Hybrid state estimators that combine model-based Kalman filtering with learned components have shown promise on simulated data, yet their performance on real-world automotive data remains insufficient. In this work we present Adaptive Multi-modal KalmanNet (AM-KNet), an advancement of KalmanNet tailored to the multi-sensor autonomous driving setting. AM-KNet introduces sensor-specific measurement modules that enable the network to learn the distinct noise characteristics of radar, lidar, and camera independently. A hypernetwork with context modulation conditions the filter on target type, motion state, and relative pose, allowing adaptation to diverse traffic scenarios. We further incorporate a covariance estimation branch based on the Joseph form and supervise it through negative log-likelihood losses on both the estimation error and the innovation. A comprehensive, component-wise loss function encodes physical priors on sensor reliability, target class, motion state, and measurement flow consistency. AM-KNet is trained and evaluated on the nuScenes and View-of-Delft datasets. The results demonstrate improved estimation accuracy and tracking stability compared to the base KalmanNet, narrowing the performance gap with classical Bayesian filters on real-world automotive data.
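To make the covariance branch more concrete, the following is a minimal NumPy sketch of the textbook Joseph-form covariance update together with a Gaussian negative log-likelihood evaluated on the estimation error and on the innovation. It is not the AM-KNet implementation: the function names `joseph_update` and `gaussian_nll`, the toy two-dimensional state, and all numeric values are assumptions chosen for illustration, and the scaled gain merely stands in for a gain produced by a network.

```python
import numpy as np


def joseph_update(P_prior, K, H, R):
    """Joseph-form posterior covariance: (I - KH) P (I - KH)^T + K R K^T.

    Unlike the short form (I - KH) P, this stays symmetric and positive
    semi-definite for any gain K, not only the optimal Kalman gain.
    """
    A = np.eye(P_prior.shape[0]) - K @ H
    return A @ P_prior @ A.T + K @ R @ K.T


def gaussian_nll(residual, cov):
    """Negative log-likelihood of a zero-mean Gaussian residual."""
    k = residual.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    maha = residual @ np.linalg.solve(cov, residual)
    return 0.5 * (maha + logdet + k * np.log(2.0 * np.pi))


# Toy setup (hypothetical values): state = [position, velocity],
# a single sensor measuring position only.
P_prior = np.array([[1.0, 0.2], [0.2, 0.5]])
H = np.array([[1.0, 0.0]])
R = np.array([[0.1]])
x_prior = np.array([0.0, 1.0])
x_true = np.array([0.3, 0.9])
y = np.array([0.35])

S = H @ P_prior @ H.T + R                # innovation covariance
K_opt = P_prior @ H.T @ np.linalg.inv(S)  # optimal Kalman gain
K_learned = 0.8 * K_opt                   # stand-in for a learned, suboptimal gain

innovation = y - H @ x_prior
x_post = x_prior + K_learned @ innovation
P_post = joseph_update(P_prior, K_learned, H, R)

# Two NLL terms, mirroring supervision on estimation error and innovation.
nll_error = gaussian_nll(x_true - x_post, P_post)
nll_innovation = gaussian_nll(innovation, S)
print(nll_error, nll_innovation)
```

With the optimal gain, the Joseph form reduces to the familiar `(I - KH) P`; its advantage shows up when the gain is learned and therefore not guaranteed to be optimal.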
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| State estimation | View-of-Delft (test) | MAE | 0.19 | 24 |
| Position State Estimation | nuScenes (test) | MAE | 0.26 | 16 |
| Velocity State Estimation | nuScenes (test) | MAE | 0.35 | 8 |
| State estimation | nuScenes (test) | NEES Position Consistency (%) | 60.27 | 4 |
| State estimation | View-of-Delft (test) | NEES Position Consistency (%) | 76.97 | 4 |
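The exact definition of the NEES position consistency metric is not given here. As a rough illustration, the sketch below uses one common reading: the percentage of samples whose normalized estimation error squared (NEES) for the 2D position falls inside the two-sided 95% chi-square interval. The function names, the 95% level, and the synthetic data are all assumptions, not the benchmark's definition.

```python
import numpy as np


def nees(errors, covs):
    """Per-sample NEES e^T P^{-1} e for errors (N, d) and covariances (N, d, d)."""
    sol = np.linalg.solve(covs, errors[..., None])[..., 0]
    return np.einsum("nd,nd->n", errors, sol)


def consistency_percent(nees_values, lower, upper):
    """Percentage of NEES values inside [lower, upper]."""
    inside = (nees_values >= lower) & (nees_values <= upper)
    return 100.0 * inside.mean()


# For 2 degrees of freedom the chi-square CDF is 1 - exp(-x / 2),
# so the quantiles have a closed form and no SciPy is needed.
lower = -2.0 * np.log(1.0 - 0.025)
upper = -2.0 * np.log(1.0 - 0.975)

# Synthetic, perfectly calibrated filter: errors drawn from the
# reported covariance (identity here), purely for demonstration.
rng = np.random.default_rng(0)
n_samples = 20000
errors = rng.standard_normal((n_samples, 2))
covs = np.broadcast_to(np.eye(2), (n_samples, 2, 2))

pct = consistency_percent(nees(errors, covs), lower, upper)
print(f"position consistency: {pct:.2f}%")  # close to 95 for a calibrated filter
```

Under this reading, values well below 95% suggest that the reported covariances are over- or under-confident relative to the actual position errors.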