| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Proficiency Estimation | Ego-Exo4D | Bouldering Proficiency Score: 65.41 | 16 |
| Cross-view Instance Segmentation | Ego-Exo4D Exo-to-Ego | IoU: 68 | 15 |
| Cross-view Instance Segmentation | Ego-Exo4D Ego-to-Exo | IoU: 67.7 | 15 |
| Relative camera pose estimation | Ego-Exo4D (val) | Rotation Only AUC@5: 38.54 | 13 |
| Expert Demonstration Retrieval | Ego-Exo4D 1.0 (test) | Recall@50: 22.5 | 13 |
| Expert Commentary Generation | Ego-Exo4D 1.0 (test) | BLEU-4: 45.8 | 13 |
| Exo-to-Ego object correspondence | Ego-Exo4D Correspondences v2 (test) | IoU: 49.6 | 11 |
| Ego-to-Exo object correspondence | Ego-Exo4D Correspondences v2 (test) | IoU: 46.3 | 11 |
| Cross-view Object Correspondence | Ego-Exo4D v2 (test) | Ego Query IoU: 42.57 | 11 |
| Egocentric Text Retrieval | Ego-Exo4D | Physical iv Top-1 Accuracy: 56.1 | 8 |
| Multi-view video understanding | Ego-Exo4D Demonstrator Proficiency | Accuracy: 44.2 | 7 |
| Exo-to-Ego Video Generation | Ego-Exo4D Cooking | PSNR: 14.3897 | 5 |
| Exo-to-Ego Video Generation | Ego-Exo4D Bike | PSNR: 15.6301 | 5 |
| Exo-to-Ego Video Generation | Ego-Exo4D Health | PSNR: 16.7139 | 5 |
| Instructional Streaming Video Generation | Ego-Exo4D KeyStep (val) | Average Score: 0.361 | 5 |
| 4D Hand Motion Reconstruction | Ego-Exo4D | Jerk: 5.26 | 5 |
| Motion Editing | Ego-Exo4D Basketball and Soccer (test) | Mikan Pose Improvement (P): 6.18 | 4 |
| Keystep Recognition | Ego-Exo4D | Recall Accuracy: 32.37 | 4 |
| Exocentric-to-Egocentric Translation | Ego-Exo4D (unseen actions) | FID: 61.231 | 4 |
| Image-Conditioned Video Editing | Ego-Exo4D (test) | Motion Consistency: 0.69 | 4 |
| Correspondence | Ego-Exo4D Exo-view (test) | IoU: 43.8 | 4 |
| Correspondence | Ego-Exo4D Ego-view (test) | IoU: 1,460 | 4 |
| Keystep Localization | Ego-Exo4D | Rank@1 Accuracy (IoU=0.3): 34.68 | 3 |
| Correspondence | Ego-Exo4D Exo-view (val) | IoU: 9.6 | 1 |
| Correspondence | Ego-Exo4D Ego-view (val) | IoU: 0.079 | 1 |