SHARP: Short-Window Streaming for Accurate and Robust Prediction in Motion Forecasting
About
In dynamic traffic environments, motion forecasting models must continuously and accurately estimate future trajectories. Streaming-based methods are a promising solution, but despite recent advances, their performance often degrades when they are exposed to heterogeneous observation lengths. To address this, we propose a novel streaming-based motion forecasting framework that explicitly focuses on evolving scenes. Our method incrementally processes incoming observation windows and leverages instance-aware context streaming to maintain and update latent agent representations across inference steps. A dual training objective further enables consistent forecasting accuracy across diverse observation horizons. Extensive experiments on Argoverse 2, nuScenes, and Argoverse 1 demonstrate the robustness of our approach both under evolving scene conditions and on single-agent benchmarks. Our model achieves state-of-the-art performance in streaming inference on the Argoverse 2 multi-agent benchmark while maintaining minimal latency, highlighting its suitability for real-world deployment.
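The streaming idea described above (short observation windows processed one at a time, with a per-agent latent state carried between inference steps) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the paper's architecture: `StreamingState`, `encode_window`, and `stream_step` are hypothetical names, the encoder is a plain mean over positions, the update is a fixed moving average standing in for a learned one, and dropping agents that leave the scene is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class StreamingState:
    """Hypothetical per-agent memory carried across inference steps."""
    latents: dict = field(default_factory=dict)  # agent_id -> [x, y] latent


def encode_window(window):
    """Stand-in encoder (assumption): mean position of a short window."""
    n = len(window)
    return [sum(p[0] for p in window) / n, sum(p[1] for p in window) / n]


def stream_step(state, observations, momentum=0.5):
    """Update each agent's latent from the newest short window.

    observations maps agent_id -> list of (x, y) points in the current window.
    A learned update would replace the fixed moving average used here.
    """
    for agent_id, window in observations.items():
        new = encode_window(window)
        old = state.latents.get(agent_id)
        if old is None:
            # First time this agent is seen: initialise its latent.
            state.latents[agent_id] = new
        else:
            # Blend the stored latent with the new window's encoding.
            state.latents[agent_id] = [
                momentum * o + (1.0 - momentum) * v for o, v in zip(old, new)
            ]
    # Assumption: agents absent from the current window are forgotten.
    for agent_id in list(state.latents):
        if agent_id not in observations:
            del state.latents[agent_id]
    return state
```

Keying the memory by agent identity is what makes the context instance-aware in this sketch: an agent observed for many windows and one that has just appeared are handled by the same update, which is one way to picture heterogeneous observation lengths.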
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Motion forecasting | Argoverse 1.0 (val) | minFDE6 | 0.9 | 38 |
| Motion Prediction | Argoverse official leaderboard (test) | minADE (6 steps) | 0.64 | 37 |
| Motion forecasting | nuScenes (test) | mFDE (1s) | 6.02 | 11 |
| Motion forecasting | Argoverse 1 | mADE6 | 0.59 | 11 |
| Motion forecasting | Argoverse 2 | minADE (K=6) | 0.64 | 6 |
| Motion forecasting | nuScenes | mADE5 | 1.13 | 6 |
| Multi-agent motion forecasting | Argoverse 2 (AV2) (test) | Average minADE (K=1) | 1.03 | 6 |
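
For readers unfamiliar with the metrics in the table, the following is a minimal sketch of minADE and minFDE over K predicted trajectories, using the common "minimum over hypotheses" definition. The function names are hypothetical, and individual benchmarks may differ in detail (for example, some select the best hypothesis by endpoint error before reporting its average error), so treat this as illustrative rather than as any leaderboard's official evaluation code.

```python
import math


def displacement_errors(pred, gt):
    """Per-timestep Euclidean distance between a prediction and ground truth."""
    return [math.hypot(px - gx, py - gy) for (px, py), (gx, gy) in zip(pred, gt)]


def min_ade(preds, gt):
    """Minimum over K hypotheses of the average displacement error."""
    return min(
        sum(errors) / len(errors)
        for errors in (displacement_errors(p, gt) for p in preds)
    )


def min_fde(preds, gt):
    """Minimum over K hypotheses of the final-timestep displacement error."""
    return min(displacement_errors(p, gt)[-1] for p in preds)


# Two hypotheses (K=2) for a three-step ground-truth trajectory.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
preds = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 3.0)],  # exact until the last step
    [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5)],  # constant 0.5 offset
]
print(min_ade(preds, gt), min_fde(preds, gt))
```

Lower is better for both metrics, and K is the number of hypotheses the model is allowed to output per agent.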