R&D: Balancing Reliability and Diversity in Synthetic Data Augmentation for Semantic Segmentation

About

Collecting and annotating datasets for pixel-level semantic segmentation is highly labor-intensive. Data augmentation offers a viable alternative, improving model generalization without additional real-world data collection. Traditional augmentation techniques, such as translation, scaling, and color transformations, create geometric and photometric variations but cannot generate new structures. While generative models have been employed to extend the semantic information of datasets, they often struggle to maintain consistency between the original and generated images, particularly for pixel-level tasks. In this work, we propose a novel synthetic data augmentation pipeline that integrates controllable diffusion models. Our approach balances the diversity and reliability of the generated data, effectively bridging the gap between synthetic and real data. We further use class-aware prompting and visual prior blending to improve image quality, ensuring precise alignment with segmentation labels. Evaluating on benchmark datasets such as PASCAL VOC and BDD100K, we demonstrate that our method significantly enhances semantic segmentation performance, especially in data-scarce scenarios, while improving model robustness in real-world applications. Our code is available at https://github.com/chequanghuy/Enhanced-Generative-Data-Augmentation-for-Semantic-Segmentation-via-Stronger-Guidance.

Huy Che, Dinh-Duy Phan, Duc-Khai Lam • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic segmentation | VOC | mIoU | 84 | 55 |
| Drivable Area Segmentation | BDD100K Foggy | mIoU | 80.4 | 3 |
| Drivable Area Segmentation | BDD100K Tunnel | mIoU (%) | 87.3 | 3 |
| Drivable Area Segmentation | BDD100K Gas Station | mIoU | 72.1 | 3 |
| Lane Line Segmentation | BDD100K Foggy | Accuracy | 67.8 | 3 |
| Lane Line Segmentation | BDD100K Tunnel | Accuracy | 85.2 | 3 |
| Lane Line Segmentation | BDD100K Gas Station | Accuracy | 63.7 | 3 |
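
Most of the results above are reported as mIoU (mean intersection-over-union). For readers unfamiliar with the metric, the following is a minimal sketch of how it is commonly computed from predicted and ground-truth label maps. It is not the evaluation code behind these benchmarks. The function name `mean_iou`, the flat-list input format, and the default `ignore_index=255` (borrowed from the PASCAL VOC void-label convention) are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over two equal-length flat sequences of integer class IDs.

    Illustrative sketch only: names and the ignore_index default are assumptions.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels without a ground-truth label
        if p == t:
            # Correct pixel: counts once toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the true class...
            union[t] += 1
            # ...and of the predicted class, if it is a valid class ID.
            if 0 <= p < num_classes:
                union[p] += 1
    # Average only over classes that appear in the prediction or the ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Two classes, four pixels, the last one unlabeled (255).
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2))  # prints 0.5
```

In the usage line, each class has one correctly labeled pixel and a union of two pixels, so both per-class IoUs are 0.5 and so is their mean. Benchmark tables typically report this value multiplied by 100.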
