CrackSegFlow: Controllable Flow Matching Synthesis for Generalizable Crack Segmentation with a 50K Image-Mask Benchmark
About
Defect segmentation is central to computer-vision-based inspection of infrastructure assets during both construction and operation. However, deployment remains limited by scarce pixel-level labels and by domain shift across environments. We introduce CrackSegFlow, a controllable Flow Matching (FM) synthesis method that renders synthetic crack images from masks with pixel-level alignment. Our renderer combines topology-preserving mask injection with edge gating to maintain thin-structure continuity. A class-conditional FM model samples masks for topology diversity, and CrackSegFlow renders images aligned with those ground-truth masks. We further inject cracks onto crack-free backgrounds to diversify confounders and reduce false positives. Across five datasets with a CNN-Transformer backbone, adding synthesized pairs improves in-domain performance by +5.37 mIoU and +5.13 F1, while target-guided cross-domain synthesis driven by target mask statistics adds +13.12 mIoU and +14.82 F1. We also release CSF-50K, a benchmark dataset comprising 50,000 image-mask pairs.
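The abstract does not spell out the Flow Matching objective, so the following is only a minimal sketch of the commonly used linear-path formulation, not the authors' implementation. It assumes a noise sample `x0`, a real image `x1`, and a mask passed as conditioning; the names `fm_training_pair`, `fm_loss`, `euler_sample`, and `velocity_fn` are hypothetical, and the paper's mask injection and edge gating are not modeled here.

```python
import numpy as np

def fm_training_pair(x0, x1, t):
    """Return a point on the straight path from x0 to x1 and its target velocity.

    Assumption: linear interpolation path x_t = (1 - t) * x0 + t * x1,
    whose time derivative is the constant x1 - x0.
    """
    t = t.reshape(-1, *([1] * (x0.ndim - 1)))  # broadcast t over non-batch dims
    x_t = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target

def fm_loss(velocity_fn, x0, x1, t, mask):
    """Mean squared error between predicted and target velocity.

    velocity_fn(x_t, t, mask) stands in for a mask-conditioned network.
    """
    x_t, v_target = fm_training_pair(x0, x1, t)
    v_pred = velocity_fn(x_t, t, mask)
    return float(np.mean((v_pred - v_target) ** 2))

def euler_sample(velocity_fn, x0, mask, steps=10):
    """Integrate dx/dt = v(x, t, mask) from t = 0 to t = 1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / steps
    for k in range(steps):
        t = np.full(x.shape[0], k * dt)
        x = x + dt * velocity_fn(x, t, mask)
    return x
```

In practice `velocity_fn` would be a trained network and `x0` would be drawn from a Gaussian; sampling then turns a mask plus noise into an image that is aligned with the mask by construction.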
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Crack Segmentation | CrackTree260 (test) | mIoU | 37.7 | 4 |
| Crack Segmentation | CRACK500 (test) | mIoU | 56.4 | 4 |
| Crack Segmentation | CrackLS315 (test) | mIoU | 32 | 4 |
| Crack Segmentation | CFD (test) | mIoU | 50.6 | 4 |
| Crack Segmentation | S2DS (test) | mIoU | 48.1 | 4 |
| Crack Image Synthesis | CrackTree260 | FID | 23.33 | 1 |
| Crack Image Synthesis | CRACK500 | FID | 28.94 | 1 |
| Crack Image Synthesis | CrackLS315 | FID | 21.76 | 1 |
| Crack Image Synthesis | CFD | FID | 57.8 | 1 |
| Crack Image Synthesis | S2DS | FID | 39.63 | 1 |
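
For readers who want to reproduce segmentation scores of this kind, here is a rough sketch of mIoU and F1 for a single binary crack mask. It assumes mIoU is the mean of the crack and background IoU and that an empty denominator scores 1.0; the benchmark's exact averaging protocol is not stated here, and the function name `binary_seg_metrics` is hypothetical.

```python
import numpy as np

def binary_seg_metrics(pred, target):
    """Compute mIoU (crack + background) and crack F1 for one binary mask pair."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = int(np.sum(pred & target))    # crack pixels correctly predicted
    fp = int(np.sum(pred & ~target))   # background predicted as crack
    fn = int(np.sum(~pred & target))   # crack pixels missed
    tn = int(np.sum(~pred & ~target))  # background correctly predicted

    # Assumption: a class absent from both masks counts as a perfect match.
    iou_crack = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    iou_bg = tn / (tn + fp + fn) if (tn + fp + fn) else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0

    return {"miou": (iou_crack + iou_bg) / 2.0, "f1": f1}
```

Because cracks occupy few pixels, the background IoU is usually high, so the crack IoU and F1 tend to be the more sensitive numbers when comparing models.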