# Implicit Neural Representations for Variable Length Human Motion Generation
## About
We propose an action-conditional human motion generation method based on variational implicit neural representations (INRs). The variational formulation yields action-conditional distributions of INRs, from which one can easily sample representations to generate novel human motion sequences. Our method supports variable-length sequence generation by construction, because part of the INR is optimized over a whole sequence of arbitrary length via temporal embeddings; previous works, in contrast, reported difficulties with modeling variable-length sequences. We confirm that our method with a Transformer decoder outperforms all relevant methods on the HumanAct12, NTU-RGBD, and UESTC datasets in terms of realism and diversity of generated motions. Surprisingly, even our method with an MLP decoder consistently outperforms the state-of-the-art Transformer-based auto-encoder. In particular, we show that variable-length motions generated by our method surpass fixed-length motions generated by the state-of-the-art method in both realism and diversity. Code: https://github.com/PACerv/ImplicitMotion.
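To make the variable-length-by-construction idea concrete, here is a minimal NumPy sketch (not the authors' implementation) of how an INR-style decoder can map a sampled latent code plus a temporal embedding to a pose, at any number of frames. All dimensions, the sinusoidal embedding, and the random weights are illustrative assumptions; a trained model would replace them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): latent code size,
# temporal-embedding size, hidden width, and per-frame pose size.
LATENT_DIM, EMB_DIM, HIDDEN, POSE_DIM = 16, 8, 32, 24

def temporal_embedding(t):
    """Sinusoidal embedding of normalized times t in [0, 1] -> (T, EMB_DIM)."""
    freqs = 2.0 ** np.arange(EMB_DIM // 2)                 # geometric frequencies
    angles = np.outer(t, freqs) * np.pi                    # (T, EMB_DIM/2)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

# Randomly initialized MLP decoder weights, standing in for trained parameters.
W1 = rng.normal(0.0, 0.1, (LATENT_DIM + EMB_DIM, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, POSE_DIM))

def decode(z, num_frames):
    """Decode a whole motion sequence of arbitrary length from one latent z."""
    t = np.linspace(0.0, 1.0, num_frames)                  # time grid for this sequence
    emb = temporal_embedding(t)                            # (T, EMB_DIM)
    x = np.concatenate([np.tile(z, (num_frames, 1)), emb], axis=-1)
    return np.tanh(x @ W1) @ W2                            # (T, POSE_DIM) pose sequence

# Sample one latent code (here from a standard normal, standing in for the
# action-conditional distribution) and decode it at two different lengths.
z = rng.normal(size=LATENT_DIM)
short, long_seq = decode(z, 30), decode(z, 120)
```

Because time enters only through the embedding, the same latent code can be decoded on any time grid, which is the sense in which variable-length generation comes for free.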
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Human Motion Generation | HumanAct12 | FID | 0.088 | 36 |
| 3D Human Motion Generation | UESTC | Accuracy | 94.1 | 14 |
| Action-conditional motion synthesis | UESTC (train test) | FID (Train Set) | 9.55 | 13 |
| Action-to-motion | UESTC (train) | FID | 9.55 | 5 |
| Action-conditional motion synthesis | HumanAct12 (train) | FID (train) | 0.088 | 5 |
| Action-conditioned Motion Generation | HumanAct12 (test) | FID | 0.088 | 5 |
| Action-to-motion | UESTC (test) | FID | 15 | 5 |