| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Human Pose Estimation | J-HMDB sub | Head Accuracy | 99.9 | 49 |
| Segmentation from a sentence | J-HMDB Sentences (test) | P@0.5 | 0.88 | 20 |
| Spatio-temporal Action Localization | J-HMDB-21 | Video mAP (IoU=0.2) | 93.1 | 15 |
| Action Classification | J-HMDB (averaged over 3 splits) | Accuracy | 76.1 | 14 |
| Action Detection | J-HMDB | V-Score (IoU 0.5) | 88.1 | 10 |
| Spatio-temporal action detection | J-HMDB (3 splits) | video-mAP (IoU=0.2) | 85.7 | 10 |
| Spatio-temporal Action Detection | J-HMDB | mAP@0.50 | 74.74 | 9 |
| Action Recognition | sub-J-HMDB (test) | Accuracy | 74.6 | 9 |
| Action Detection | J-HMDB trimmed (test) | Video-mAP (IoU=0.2) | 78.4 | 6 |
| Keypoint Propagation | J-HMDB 19 (test) | PCK@.1 | 68.7 | 5 |
| Spatial action detection | J-HMDB | Video mAP (IoU=0.5) | 73.47 | 5 |
| Action Segmentation | J-HMDB (average over three splits) | mIoU | 68.1 | 3 |
| Action Recognition | J-HMDB (Split 1) | Mean Per-Class Acc | 60.2 | 1 |