
DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control

About

Large, pretrained latent diffusion models (LDMs) have demonstrated an extraordinary ability to generate creative content, specialize to user data through few-shot fine-tuning, and condition their output on other modalities, such as semantic maps. However, are they usable as large-scale data generators, e.g., to improve tasks in the perception stack, like semantic segmentation? We investigate this question in the context of autonomous driving, and answer it with a resounding "yes". We propose an efficient data generation pipeline termed DGInStyle. First, we examine the problem of specializing a pretrained LDM to semantically-controlled generation within a narrow domain. Second, we propose a Style Swap technique to endow the rich generative prior with the learned semantic control. Third, we design a Multi-resolution Latent Fusion technique to overcome the bias of LDMs towards dominant objects. Using DGInStyle, we generate a diverse dataset of street scenes, train a domain-agnostic semantic segmentation model on it, and evaluate the model on multiple popular autonomous driving datasets. Our approach consistently increases the performance of several domain generalization methods compared to the previous state-of-the-art methods. The source code and the generated dataset are available at https://dginstyle.github.io.

Yuru Jia, Lukas Hoyer, Shengyu Huang, Tianfu Wang, Luc Van Gool, Konrad Schindler, Anton Obukhov • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
Semantic segmentation | Mapillary (val) | mIoU | 62.5 | 153
Semantic segmentation | CityScapes, BDD, and Mapillary (val) | Mean mIoU | 57.78 | 85
Semantic segmentation | Mapillary Vistas (val) | mIoU | 68.32 | 72
Semantic segmentation | BDD100K (val) | mIoU | 52.3 | 72
Semantic segmentation | BDD (val) | mIoU | 58.84 | 47
Semantic segmentation | Dv 19-class (val) | ACDC-19 Score | 50.4 | 46
Semantic segmentation | Dv 58-class (val) | ACDC-41 | 60.2 | 46
Semantic segmentation | GTA to {Cityscapes, BDD100K, Mapillary, ACDC, DarkZurich} (val) | mIoU (Cityscapes) | 58.63 | 31
Semantic segmentation | ACDC (val) | mIoU | 61 | 29
Semantic segmentation | Dark Zurich (val) | mIoU | 32.6 | 25
Showing 10 of 15 rows
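
The benchmark results above are reported as mIoU (mean intersection over union), which the page does not define. The following is a minimal sketch of the standard definition, computed from a class confusion matrix. It is not taken from the DGInStyle code base: the function names are hypothetical, and the `ignore_index=255` default is an assumption borrowed from the common Cityscapes labeling convention.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (target, pred) label pairs over flat sequences of class ids.

    ignore_index is an assumed "void" label; pixels with that target
    label are skipped entirely.
    """
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        conf[t][p] += 1  # rows: ground truth, columns: prediction
    return conf


def mean_iou(conf):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                       # class-c pixels that were missed
        fp = sum(conf[r][c] for r in range(n)) - tp  # pixels wrongly labeled c
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with two classes and one ignored pixel.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 255]
conf = confusion_matrix(pred, target, num_classes=2)
score = mean_iou(conf)
```

In practice the confusion matrix is usually accumulated over the whole validation set before the per-class IoUs are averaged. The "Mean mIoU" row presumably averages such per-dataset scores across several datasets, although the page does not say so explicitly.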
