Domain-Adaptive Self-Supervised Pre-Training for Face & Body Detection in Drawings

About

Drawings are a powerful means of pictorial abstraction and communication. Understanding diverse forms of drawings, including digital art, cartoons, and comics, has been a major problem of interest for the computer vision and computer graphics communities. Although large amounts of digitized drawings from comic books and cartoons are available, they vary widely in style, so training domain-specific recognizers requires expensive manual labeling. In this work, we show how self-supervised learning, based on a teacher-student network with a modified student network update design, can be used to build face and body detectors. Our setup makes it possible to exploit large amounts of unlabeled data from the target domain when labels are provided for only a small subset of it. We further demonstrate that style transfer can be incorporated into our learning pipeline to bootstrap detectors from a large set of out-of-domain labeled natural images (i.e., images from the real world). Our combined architecture yields detectors with state-of-the-art (SOTA) and near-SOTA performance using minimal annotation effort. Our code is available at https://github.com/barisbatuhan/DASS_Detector.
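
As a rough illustration of the teacher-student idea mentioned in the abstract, the following is a minimal sketch of a generic mean-teacher step: the teacher's weights track an exponential moving average (EMA) of the student's, and only confident teacher detections on unlabeled images are kept as pseudo-labels for the student. It does not reproduce the paper's modified student update, which the abstract does not detail. The function names, the `momentum` and `threshold` values, and the detection dictionary layout are all assumptions made for this example.

```python
def ema_update(teacher, student, momentum=0.99):
    """Return new teacher weights as an EMA of the student weights.

    `teacher` and `student` are flat lists of floats standing in for
    model parameters (an assumption made to keep the sketch dependency-free).
    """
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def filter_pseudo_labels(detections, threshold=0.5):
    """Keep only teacher detections confident enough to supervise the student.

    Each detection is a hypothetical dict with a "box" and a "score".
    """
    return [d for d in detections if d["score"] >= threshold]


# Toy usage: one EMA step and one pseudo-label filtering pass.
teacher_weights = [0.0, 1.0]
student_weights = [1.0, 1.0]
teacher_weights = ema_update(teacher_weights, student_weights, momentum=0.9)

teacher_detections = [
    {"box": (10, 10, 50, 50), "score": 0.92},  # confident face candidate
    {"box": (5, 60, 30, 90), "score": 0.31},   # low-confidence, discarded
]
pseudo_labels = filter_pseudo_labels(teacher_detections, threshold=0.5)
```

In a real detector the weights would be tensors and the student would be trained by gradient descent on the labeled subset plus these pseudo-labels; the lists here only show the update rule.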

Barış Batuhan Topal, Deniz Yuret, Tevfik Metin Sezgin • 2022

Related benchmarks

Task	Dataset	Metric	Result	
Object Detection	Watercolor2k (test)	mAP (Overall)	89.81	113
Object Detection	Clipart1k (test)	mAP	83.59	70
Object Detection	Comic2k (test)	mAP	73.65	62
Body Detection	Manga 109 Bodies (test)	AP	87.98	9
Body Detection	DCM 772 (Bodies) (test)	AP	86.14	9
Face Detection	Manga 109 Faces (test)	AP	87.88	8
Face Detection	DCM 772 Faces (test)	AP	82.45	8
Face Detection	iCartoonFace (test)	AP	90.01	8
