Towards Robust Adaptive Object Detection under Noisy Annotations
About
Domain Adaptive Object Detection (DAOD) models a joint distribution of images and labels from an annotated source domain and learns a domain-invariant transformation to estimate the target labels given the target domain images. Existing methods assume that the source domain labels are completely clean, yet large-scale datasets often contain error-prone annotations due to instance ambiguity, which may lead to a biased source distribution and severely degrade the performance of the domain adaptive detector in practice. In this paper, we make the first effort to formulate noisy DAOD and propose a Noise Latent Transferability Exploration (NLTE) framework to address this issue. It features 1) Potential Instance Mining (PIM), which leverages eligible proposals to recapture the mis-annotated instances from the background; 2) Morphable Graph Relation Module (MGRM), which models the adaptation feasibility and transition probability of noisy samples with relation matrices; 3) Entropy-Aware Gradient Reconcilement (EAGR), which incorporates the semantic information into the discrimination process and enforces the gradients provided by noisy and clean samples to be consistent towards learning domain-invariant representations. A thorough evaluation on benchmark DAOD datasets with noisy source annotations validates the effectiveness of NLTE. In particular, NLTE improves the mAP by 8.4% under 60% corrupted annotations and even approaches the ideal upper bound of training on a clean source dataset.
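To make the phrase "60% corrupted annotations" concrete, the following is a minimal sketch of how class labels could be corrupted at a given noise rate. It assumes symmetric class-flip noise, which is a common convention but is not confirmed by the abstract as the authors' protocol; the function name `corrupt_labels` and its parameters are hypothetical and chosen only for illustration.

```python
import random


def corrupt_labels(labels, num_classes, noise_rate, seed=0):
    """Return a copy of `labels` with a `noise_rate` fraction flipped.

    Assumption: symmetric noise, i.e. each selected label is replaced by a
    different class drawn uniformly at random. This is an illustrative
    convention, not necessarily the protocol used in the paper.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_flip = round(noise_rate * len(noisy))
    # Choose which annotations to corrupt, without replacement.
    for i in rng.sample(range(len(noisy)), n_flip):
        # Exclude the current class so the label always changes.
        choices = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(choices)
    return noisy


# Toy example: 1000 box labels over 20 classes, corrupted at a 60% noise rate.
clean = [i % 20 for i in range(1000)]
noisy = corrupt_labels(clean, num_classes=20, noise_rate=0.6)
changed = sum(a != b for a, b in zip(clean, noisy))
print(f"fraction of corrupted labels: {changed / len(clean):.2f}")
```

This sketch only covers wrong class labels. The abstract also mentions instances that are missing from the annotations and absorbed into the background, which would need a separate step that drops boxes.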
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Object Detection | Cityscapes to Foggy Cityscapes (test) | mAP | 45.4 | 196 |
| Object Detection | Foggy Cityscapes (test) | mAP (Mean Average Precision) | 45.4 | 108 |
| Object Detection | PASCAL VOC to Water Color (test) | mAP | 40.9 | 64 |
| Object Detection | PASCAL VOC to Clipart target domain | mAP | 34.1 | 61 |
| Object Detection | Foggy Cityscapes to Cityscapes (test) | AP (person) | 37 | 21 |
| Domain Adaptive Object Detection | Foggy Cityscapes (val) | AP (Person) | 43.1 | 18 |
| Object Detection | Foggy Cityscapes full (val) | AP (Person) | 43.1 | 15 |
| Object Detection | Pascal VOC (NR 0%) → Clipart1k 2007+2012 (test) | mAP | 36.5 | 10 |
| Object Detection | Noisy Pascal VOC (NR 60%) → Clipart1k | AP (Aero) | 33 | 5 |
| Object Detection | Noisy Pascal VOC NR 80% → Clipart1k | AP (Aero) | 36 | 5 |