| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Object-Centric Learning | MOVi-C | MBO^i 49 | 29 |
| Object Discovery | MOVi-C | mBOi 49 | 22 |
| Object-Centric Learning | MOVi-E | MBO^i 40.1 | 22 |
| Unsupervised Object Segmentation | MOVi-C | FG-ARI 68.6 | 18 |
| Object Segmentation | MOVi-E | FG-ARI 80.8 | 13 |
| Video Amodal Segmentation | MOVi-B | mIoU 83.93 | 11 |
| Unsupervised Object Segmentation | MOVi-E (test) | mBO 43.38 | 11 |
| Video Object Discovery | MOVi-E synthetic (test) | ARI 41.6 | 8 |
| Unsupervised Object Segmentation | MOVi-E | MBO^i 40.1 | 8 |
| Unsupervised Video Object Segmentation | MOVi-C 24 frames (val) | mIoU 45.2 | 8 |
| Unsupervised Video Object Discovery | MOVi-C conditional (test) | ARI 68.4 | 7 |
| Object Discovery | MOVi-E v1 (test) | FG-ARI 82.9 | 7 |
| Downstream Property Prediction | MOVi-C | Position Error 0.01 | 7 |
| Object Discovery | MOVi-C (val) | fg-ARI 67.6 | 7 |
| Video Object-Centric Learning | MOVi-E | FG-ARI 83.7 | 6 |
| Object Dynamics Prediction | MOVi-C (test) | FG-ARI 70 | 6 |
| Unsupervised Image Segmentation | MOVi-C individual frames | Image FG-ARI 68.6 | 6 |
| Downstream Property Prediction | MOVi-E (test) | Position Error 1.85 | 6 |
| Rigid Body Trajectory Prediction | MOVi-B 100 frames (test) | Position RMSE (m) 0.161 | 5 |
| Rigid Body Trajectory Prediction | MOVi-B 75 frames (test) | Position RMSE (m) 0.095 | 5 |
| Rigid Body Trajectory Prediction | MOVi-A 100 frames (test) | Position RMSE (m) 0.177 | 5 |
| Rigid-Body Dynamics Modeling | MOVi | Warmup Frames 2 | 5 |
| Object Discovery | MOVi-D | ARI 45.3 | 5 |
| Video Object-Centric Learning | MOVi-C | FG-ARI 77.6 | 5 |
| Video Object Segmentation | MOVi-E | mBO-V 35.6 | 5 |