History Repeats Itself: Human Motion Prediction via Motion Attention
About
Human motion prediction aims to forecast future human poses given a past motion. Whether based on recurrent or feed-forward neural networks, existing methods fail to model the observation that human motion tends to repeat itself, even for complex sports actions and cooking activities. Here, we introduce an attention-based feed-forward network that explicitly leverages this observation. In particular, instead of modeling frame-wise attention via pose similarity, we propose to extract motion attention to capture the similarity between the current motion context and the historical motion sub-sequences. Aggregating the relevant past motions and processing the result with a graph convolutional network allows us to effectively exploit motion patterns from the long-term history to predict the future poses. Our experiments on Human3.6M, AMASS and 3DPW evidence the benefits of our approach for both periodical and non-periodical actions. Thanks to our attention model, it yields state-of-the-art results on all three datasets. Our code is available at https://github.com/wei-mao-2019/HisRepItself.
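The idea of motion attention described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). The function name `motion_attention`, the parameters `history`, `m`, and `t`, the use of raw flattened poses as keys and queries, and the softmax over scaled dot products are all assumptions made for illustration. The sketch compares the last `m` observed frames (the current motion context) with the first `m` frames of every historical sub-sequence of length `m + t`, then aggregates those sub-sequences using the resulting weights.

```python
import numpy as np


def motion_attention(history, m, t):
    """Illustrative motion attention over a pose history.

    history: (N, D) array of past poses, one flattened pose per frame.
    m: number of frames in the motion context (query/key length).
    t: number of frames that follow each key inside a sub-sequence.
    Returns (weights, aggregated): weights has one entry per sub-sequence,
    aggregated is the weighted sum of sub-sequences with shape (m + t, D).
    """
    n, _ = history.shape
    span = m + t
    if n < span:
        raise ValueError("history must contain at least m + t frames")

    # Query: the most recent m frames, flattened into one vector.
    query = history[n - m:].reshape(-1)

    # One sub-sequence per valid start index; the key is its first m frames,
    # the value is the whole sub-sequence (context plus what followed it).
    starts = range(n - span + 1)
    keys = np.stack([history[i:i + m].reshape(-1) for i in starts])
    values = np.stack([history[i:i + span] for i in starts])

    # Assumption: scaled dot-product similarity with a softmax normalization.
    scores = keys @ query / np.sqrt(query.size)
    scores = scores - scores.max()  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum()

    # Weighted sum over sub-sequences: (K,) x (K, span, D) -> (span, D).
    aggregated = np.tensordot(weights, values, axes=1)
    return weights, aggregated
```

In the approach described in the abstract, the aggregated motion would then be processed by a graph convolutional network to produce the future poses; that stage is omitted here.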
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Collaborative Human Motion Prediction | ExPI unseen action 1.0 | JME | 80 | 150 |
| Human Motion Prediction | Human3.6M (test) | MPJPE | 10.4 | 85 |
| Multi-person motion prediction | ExPI (common action split) | A1 (A-frame) Error | 34 | 84 |
| Long-term Human Motion Prediction | Human3.6M | Average Error (MPJPE) | 77.3 | 58 |
| Human Motion Prediction | Human3.6M | MAE (1000ms) | 1.57 | 46 |
| 3D joint position forecasting | Human3.6M | Walking Error | 8.1 | 40 |
| 3D Human Motion Prediction | 3DPW (test) | MPJPE (mm) | 12.6 | 40 |
| Human Pose Forecasting | AMASS BMLrub (test) | MPJPE (mm) | 11.3 | 40 |
| Human Motion Prediction | Human3.6M (short-term) | -- | -- | 40 |
| Collaborative Human Motion Prediction | ExPI (single action split) | JME | 66 | 28 |