| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Human-Object Interaction Generation | BEHAVE (test) | FID 0.001 | 10 |
| 3D Human-Object Interaction Generation | BEHAVE (test) | FID 0.093 | 9 |
| 3D Human and Object Reconstruction | BEHAVE | CD Human 4.59 | 9 |
| Human Mesh Recovery | BEHAVE (Protocol 2) | MPJPE 32.6 | 8 |
| Human Mesh Recovery | BEHAVE (Protocol 1) | MPJPE 48.9 | 8 |
| Contact Estimation | BEHAVE (unseen) | Precision 75.4 | 8 |
| Joint Human and Object Reconstruction | BEHAVE (test) | CD (SMPL) (cm) 5.241 | 8 |
| 3D Human Reconstruction | BEHAVE | SMPL v2v Error (cm) 4.99 | 8 |
| Human Mesh Recovery | BEHAVE | PA-MPJPE 22.7 | 7 |
| Joint Human-Object Tracking | BEHAVE extended (key frames) | SMPL Chamfer Distance 5.24 | 6 |
| 3D Object Reconstruction | BEHAVE (test) | Chamfer Distance (cm) 4.66 | 6 |
| 6-DoF Object Tracking | BEHAVE (test) | ADD-S 25.71 | 6 |
| Video Depth Estimation | BEHAVE | Abs Rel 0.033 | 5 |
| HOI Video Generation | BEHAVE (test) | CLIPSIM 0.3138 | 5 |
| Human-Object Interaction Reconstruction | BEHAVE | Chamfer Distance 6.295 | 5 |
| Adaptation to Novel Object and Interaction Skills | BEHAVE | Success Rate 52 | 4 |
| Human-Object Interaction Scene Generation | BEHAVE 3D Generative Models (evaluation set) | CLIP Score 0.2968 | 4 |
| 3D Reconstruction | BEHAVE (test) | Combined F-score @0.01m 46.04 | 4 |
| 3D Shape Reconstruction | BEHAVE real-world | Chamfer Distance 0.062 | 4 |
| Spatial Position Prediction | BEHAVE (unseen) | MSE 0.084 | 4 |
| Joint Human-Object Tracking | BEHAVE extended (test) | SMPL Chamfer Distance 5.25 | 4 |
| 4D Human-Object Interaction Reconstruction | BEHAVE (test) | Chamfer Distance (Human) 7.25 | 3 |
| Joint Human and Object Reconstruction | BEHAVE Extended Comparison Subset (test) | CD (SMPL) (cm) 5.22 | 3 |
| 3D Object Reconstruction | BEHAVE | v2v Distance (cm) 21.2 | 3 |
| Interaction Prediction | BEHAVE | Global MPMPE 0.105 | 3 |