Multi-agent Long-term 3D Human Pose Forecasting via Interaction-aware Trajectory Conditioning
About
Human pose forecasting garners attention for its diverse applications. However, challenges in modeling the multi-modal nature of human motion and intricate interactions among agents persist, particularly with longer timescales and more agents. In this paper, we propose an interaction-aware trajectory-conditioned long-term multi-agent human pose forecasting model, utilizing a coarse-to-fine prediction approach: multi-modal global trajectories are initially forecasted, followed by respective local pose forecasts conditioned on each mode. In doing so, our Trajectory2Pose model introduces a graph-based agent-wise interaction module for a reciprocal forecast of local motion-conditioned global trajectory and trajectory-conditioned local pose. Our model effectively handles the multi-modality of human motion and the complexity of long-term multi-agent interactions, improving performance in complex environments. Furthermore, we address the lack of long-term (6s+) multi-agent (5+) datasets by constructing a new dataset from real-world images and 2D annotations, enabling a comprehensive evaluation of our proposed model. State-of-the-art prediction performance on both complex and simpler datasets confirms the generalized effectiveness of our method. The code is available at https://github.com/Jaewoo97/T2P.
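The coarse-to-fine scheme described above can be sketched as follows. This is a minimal illustrative mock-up, not the actual T2P implementation: all function names, dimensions, and the linear-extrapolation/last-pose placeholders are assumptions made for illustration. Stage one produces K candidate global (root) trajectories per agent; stage two produces a local pose forecast conditioned on each trajectory mode.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, T_OBS, T_PRED = 5, 10, 30   # agents, observed frames, predicted frames
N_JOINTS, K_MODES = 15, 3             # joints per agent, trajectory modes

def forecast_global_trajectories(root_history, k=K_MODES):
    """Coarse stage (placeholder): K candidate future root trajectories per agent."""
    # root_history: (N, T_obs, 3) root positions; extrapolate the last velocity
    vel = root_history[:, -1] - root_history[:, -2]                       # (N, 3)
    steps = np.arange(1, T_PRED + 1)[None, :, None]                       # (1, T_pred, 1)
    base = root_history[:, -1:] + vel[:, None] * steps                    # (N, T_pred, 3)
    # perturb the extrapolation to stand in for K learned multi-modal modes
    noise = 0.05 * rng.standard_normal((k, N_AGENTS, T_PRED, 3))
    return base[None] + noise                                             # (K, N, T_pred, 3)

def forecast_local_pose(pose_history, traj_mode):
    """Fine stage (placeholder): local joint motion conditioned on one trajectory mode."""
    # pose_history: (N, T_obs, J, 3) root-relative joint offsets; repeat the last pose
    last = pose_history[:, -1:]                                           # (N, 1, J, 3)
    local = np.repeat(last, T_PRED, axis=1)                               # (N, T_pred, J, 3)
    # condition on the trajectory mode: place the local pose along the global root path
    return local + traj_mode[:, :, None, :]                               # (N, T_pred, J, 3)

root_hist = rng.standard_normal((N_AGENTS, T_OBS, 3))
pose_hist = 0.1 * rng.standard_normal((N_AGENTS, T_OBS, N_JOINTS, 3))

trajs = forecast_global_trajectories(root_hist)                           # (K, N, T_pred, 3)
poses = np.stack([forecast_local_pose(pose_hist, mode) for mode in trajs])
print(poses.shape)  # one global-coordinate pose forecast per trajectory mode
```

In the actual model, both stages are learned networks coupled through a graph-based agent-wise interaction module; here simple extrapolation stands in only to make the two-stage data flow, and the per-mode conditioning, concrete.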
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multi-agent human pose forecasting | CMU-Mocap UMPM (test) | JPE | 152.4 | 8 |
| Multi-agent human pose forecasting | 3DPW (test) | JPE | 142.6 | 8 |
| Multi-agent human pose forecasting | JRDB-GlobMultiPose Short-term (test) | JPE | 224 | 8 |
| Multi-agent human pose forecasting | JRDB-GlobMultiPose Long-term (test) | JPE | 341.6 | 8 |
| Multi-person motion prediction | CMU-Mocap UMPM 3 persons | JPE (0.2s) | 38 | 8 |
| Multi-agent Pose Forecasting | CMU-Mocap UMPM (test) | JPE (0.2s) | 37.8 | 4 |