Co-mining: Self-Supervised Learning for Sparsely Annotated Object Detection

About

Object detectors usually achieve promising results with the supervision of complete instance annotations. However, their performance is far from satisfactory with sparse instance annotations. Most existing methods for sparsely annotated object detection either re-weight the loss of hard negative samples or convert the unlabeled instances into ignored regions to reduce the interference of false negatives. We argue that these strategies are insufficient since they can at most alleviate the negative effect caused by missing annotations. In this paper, we propose a simple but effective mechanism, called Co-mining, for sparsely annotated object detection. In Co-mining, the two branches of a Siamese network predict pseudo-label sets for each other. To enhance multi-view learning and better mine unlabeled instances, the original image and the corresponding augmented image are used as the inputs to the two branches of the Siamese network, respectively. Co-mining can serve as a general training mechanism applicable to most modern object detectors. Experiments are performed on the MS COCO dataset with three different sparsely annotated settings using two typical frameworks: the anchor-based detector RetinaNet and the anchor-free detector FCOS. Experimental results show that Co-mining with RetinaNet achieves 1.4%~2.1% improvements compared with different baselines and surpasses existing methods under the same sparsely annotated setting. Code is available at https://github.com/megvii-research/Co-mining.
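
The cross pseudo-labeling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the score threshold, the IoU threshold, and the rule "keep confident predictions that do not overlap an annotated box" are all assumptions made for illustration, and predictions from the augmented view are assumed to be already mapped back to the original image coordinates.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Pred = Tuple[Box, float]                 # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mine_pseudo_labels(preds: List[Pred], annotated: List[Box],
                       score_thr: float = 0.6,   # assumed value
                       iou_thr: float = 0.5      # assumed value
                       ) -> List[Box]:
    """Keep confident predictions that do not overlap any annotated box."""
    mined = []
    for box, score in preds:
        if score < score_thr:
            continue
        if all(iou(box, gt) < iou_thr for gt in annotated):
            mined.append(box)
    return mined


def co_mine(preds_a: List[Pred], preds_b: List[Pred],
            annotated: List[Box]) -> Tuple[List[Box], List[Box]]:
    """Each branch is trained on the sparse annotations plus the boxes
    mined from the OTHER branch's predictions."""
    targets_for_a = annotated + mine_pseudo_labels(preds_b, annotated)
    targets_for_b = annotated + mine_pseudo_labels(preds_a, annotated)
    return targets_for_a, targets_for_b


# Hypothetical example: one annotated instance, two branches.
annotated = [(0.0, 0.0, 10.0, 10.0)]
preds_a = [((0.0, 0.0, 10.0, 10.0), 0.9),     # matches the annotation
           ((20.0, 20.0, 30.0, 30.0), 0.8),   # confident, unlabeled
           ((40.0, 40.0, 50.0, 50.0), 0.3)]   # below the score threshold
preds_b = [((1.0, 1.0, 10.0, 10.0), 0.95),    # overlaps the annotation
           ((60.0, 60.0, 70.0, 70.0), 0.7)]   # confident, unlabeled
targets_for_a, targets_for_b = co_mine(preds_a, preds_b, annotated)
```

In this toy example, branch A receives the annotated box plus the box mined from branch B, and branch B receives the annotated box plus the box mined from branch A, so each branch supplies the other with candidate instances that the sparse annotations missed.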

Tiancai Wang, Tong Yang, Jiale Cao, Xiangyu Zhang • 2020

Related benchmarks

Task	Dataset	Result
3D Object Detection	ScanNet V2 (val)	mAP@0.25 43.3	361
3D Object Detection	SUN RGB-D (val)	mAP@0.25 55.9	163
Monocular 3D Object Detection	KITTI (test)	AP3D R40 (Mod.) 6.41	44
Monocular 3D Object Detection	KITTI (val)	--	17
BEV Object Detection	KITTI Clear	APBEV (Easy) 24.81	6
3D Object Detection	KITTI Clear	AP3D (Easy) 16.01	6
3D Object Detection	KITTI 30% annotation ratio (val)	AP3D (Easy) 16.01	6
3D Object Detection	KITTI Foggy	AP3D (Easy) 11.22	6
3D Object Detection	KITTI 10% annotation ratio (val)	AP3D (Easy) 0.00	6
3D Object Detection	KITTI 20% annotation ratio (val)	AP3D Easy 3.88	6

Showing 10 of 12 rows
