Layout-Guided Controllable Pathology Image Generation with In-Context Diffusion Transformers

About

Controllable pathology image synthesis requires reliable regulation of spatial layout, tissue morphology, and semantic detail. However, existing text-guided diffusion models offer only coarse global control and lack the ability to enforce fine-grained structural constraints. Progress is further limited by the absence of large datasets that pair patch-level spatial layouts with detailed diagnostic descriptions, since generating such annotations for gigapixel whole-slide images is prohibitively time-consuming for human experts. To overcome these challenges, we first develop a scalable multi-agent LVLM annotation framework that integrates image description, diagnostic step extraction, and automatic quality judgment into a coordinated pipeline, and we evaluate the reliability of the system through a human verification process. This framework enables efficient construction of fine-grained and clinically aligned supervision at scale. Building on the curated data, we propose In-Context Diffusion Transformer (IC-DiT), a layout-aware generative model that incorporates spatial layouts, textual descriptions, and visual embeddings into a unified diffusion transformer. Through hierarchical multimodal attention, IC-DiT maintains global semantic coherence while accurately preserving structural and morphological details. Extensive experiments on five histopathology datasets show that IC-DiT achieves higher fidelity, stronger spatial controllability, and better diagnostic consistency than existing methods. In addition, the generated images serve as effective data augmentation resources for downstream tasks such as cancer classification and survival analysis.

Yuntao Shou, Xiangyong Cao, Qian Zhao, Deyu Meng • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Survival Prediction | TCGA-UCEC | C-index | 0.7415 | 142 |
| Survival Prediction | BLCA | C-index | 0.7255 | 66 |
| Survival Prediction | BRCA | C-index | 0.7154 | 66 |
| Survival Prediction | LUAD | C-index | 0.7251 | 50 |
| Survival Prediction | GBMLGG | C-index | 0.8811 | 20 |
| Cancer Classification | TCGA cohorts (BLCA, BRCA, GBMLGG, LUAD, UCEC) Downstream tasks | Accuracy (BLCA Cohort) | 89.86 | 10 |
| Mask-to-Image Faithfulness | BLCA TCGA (test) | Faithfulness Score | 83.19 | 10 |
| Mask-to-Image Faithfulness | BRCA TCGA (test) | Faithfulness Score | 84.12 | 10 |
| Mask-to-Image Faithfulness | GBMLGG TCGA (test) | Faithfulness Score | 83.52 | 10 |
| Mask-to-Image Faithfulness | LUAD TCGA (test) | Faithfulness Score | 83.63 | 10 |
Showing 10 of 18 rows
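
The survival prediction rows above report a C-index (concordance index). This page does not say how it was computed, so the following is only a minimal sketch of the standard Harrell definition: among comparable patient pairs, the fraction in which the patient with the shorter observed survival time was assigned the higher predicted risk. The function name, argument names, and the choice to skip pairs with tied times are assumptions made for illustration, not details taken from the paper.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch, not the paper's code).

    times  -- observed follow-up times
    events -- 1 if the event was observed, 0 if the record is censored
    risks  -- predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with tied times are skipped for simplicity.
        # A pair is only comparable if the shorter time is a real event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # correctly ordered pair
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, the third one is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

On the toy data, five pairs are comparable and four of them are ordered correctly, so the result is 0.8. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the table's C-index values should be read.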
