Data Distillation: Towards Omni-Supervised Learning
About
We investigate omni-supervised learning, a special regime of semi-supervised learning in which the learner exploits all available labeled data plus internet-scale sources of unlabeled data. Omni-supervised learning is lower-bounded by performance on existing labeled datasets, offering the potential to surpass state-of-the-art fully supervised methods. To exploit the omni-supervised setting, we propose data distillation, a method that ensembles predictions from multiple transformations of unlabeled data, using a single model, to automatically generate new training annotations. We argue that visual recognition models have recently become accurate enough that it is now possible to apply classic ideas about self-training to challenging real-world data. Our experimental results show that in the cases of human keypoint detection and general object detection, state-of-the-art models trained with data distillation surpass the performance of using labeled data from the COCO dataset alone.
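The core of data distillation, as described above, is to run a single trained model on several geometrically transformed copies of an unlabeled image, map the predictions back into the original frame, and aggregate them into new training annotations. The sketch below illustrates that idea in miniature with NumPy, using only two transforms (identity and horizontal flip) and a dense score map; the function and threshold are hypothetical and not from the paper's code.

```python
import numpy as np

def data_distillation_labels(image, model, score_thresh=0.9):
    """Hedged sketch of data distillation on one unlabeled image.

    `model` maps an HxW image to an HxW score map. We run it on
    multiple transformed copies, undo each transform on the output,
    average the aligned predictions, and keep only confident scores
    as pseudo-annotations. The 0.9 threshold is an illustrative
    assumption, not a value from the paper.
    """
    preds = []

    # Transform 1: identity.
    preds.append(model(image))

    # Transform 2: horizontal flip; predict, then flip the output
    # back so it is aligned with the original frame.
    flipped_pred = model(image[:, ::-1])
    preds.append(flipped_pred[:, ::-1])

    # Ensemble by averaging the aligned predictions (the paper also
    # ensembles over multiple scales in the same spirit).
    avg = np.mean(preds, axis=0)

    # Keep only confident predictions as generated annotations.
    return avg * (avg >= score_thresh)
```

In the paper this idea is applied to structured outputs (keypoints and boxes), where the inverse transform acts on coordinates rather than score maps, but the ensembling-then-thresholding pattern is the same.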
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Object Detection | PASCAL VOC 2007 (test) | -- | -- | 821 |
| 2D Human Pose Estimation | COCO 2017 (val) | AP | 56.6 | 386 |
| Object Detection | COCO (minival) | mAP | 37.9 | 184 |
| Object Detection | MS-COCO 2014 (minival) | mAP | 33.1 | 23 |
| Instance Segmentation | COCO 2017 (val, random split) | Mask AP | 0.242 | 12 |
| Instance Segmentation | COCO 1% labels (val) | AP | 3.8 | 7 |
| Instance Segmentation | COCO 2% labels (val) | AP | 11.8 | 7 |
| Instance Segmentation | COCO 5% labels (val) | AP | 20.4 | 7 |
| Instance Segmentation | COCO 10% labels (val) | AP | 24.2 | 7 |
| Hand Pose Estimation | FPHA (test) | Hand AUC (Joint) | 74.9 | 3 |