A Synthetic Data-Driven Radiology Foundation Model for Pan-tumor Clinical Diagnosis

About

AI-assisted imaging has made substantial advances in tumor diagnosis and management. However, a major barrier to developing robust oncology foundation models is the scarcity of large-scale, high-quality annotated datasets, which are limited by privacy restrictions and the high cost of manual labeling. To address this gap, we present PASTA, a pan-tumor radiology foundation model built on PASTA-Gen, a synthetic data framework that generated 30,000 3D CT scans with pixel-level lesion masks and structured reports of tumors across ten organ systems. Leveraging this resource, PASTA achieves state-of-the-art performance on 45 of 46 oncology tasks, including non-contrast CT tumor screening, lesion segmentation, structured reporting, tumor staging, survival prediction, and MRI-modality transfer. To assess clinical applicability, we developed PASTA-AID, a clinical decision support system, and ran a retrospective simulated clinical trial across two scenarios. For pan-tumor screening on plain CT with fixed reading time, PASTA-AID increased radiologists' throughput by 11.1-25.1% and improved sensitivity by 17.0-31.4% and precision by 10.5-24.9%. In a diagnosis-aid workflow, it reduced segmentation time by up to 78.2% and reporting time by up to 36.5%. Beyond gains in accuracy and efficiency, PASTA-AID narrowed the expertise gap, enabling less-experienced radiologists to approach expert-level performance. Together, this work establishes an end-to-end, synthetic data-driven pipeline spanning data generation, model development, and clinical validation, demonstrating substantial potential for pan-tumor research and clinical translation.

Wenhui Lei, Hanyu Chen, Zitian Zhang, Luyang Luo, Qiong Xiao, Yannian Gu, Peng Gao, Yankai Jiang, Ci Wang, Guangtao Wu, Tongjia Xu, Yingjie Zhang, Pranav Rajpurkar, Xiaofan Zhang, Shaoting Zhang, Zhenning Wang • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Segmentation | Lung tumour | DSC 70.9 | 30 |
| Segmentation | Liver tumour | DSC 69.6 | 30 |
| Segmentation | Gallbladder cancer | DSC 64.9 | 15 |
| Pan-cancer Segmentation | Internal datasets | Lung Tumor DSC 52.1 | 14 |
| Pan-cancer Segmentation | Healthy Datasets CHAOS, TCIA, Atlas | CHAOS Score 45 | 10 |
| Pan-cancer Segmentation | Corona COVID-19 (External) | DSC 58.1 | 10 |
| Pan-cancer Segmentation | IRCADb liver tumors (External) | DSC 0.527 | 10 |
| Pan-cancer Segmentation | External Datasets Rider, Corona, IRCADb Average | Average DSC (%) 45 | 10 |
| Pan-cancer Screening | FLARE 2023 | DSC 34.6 | 10 |
| Pan-cancer Segmentation | Rider lung tumors (External) | DSC (%) 24.2 | 10 |
Showing 10 of 27 rows
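
Most results above are reported as DSC, the Dice similarity coefficient between a predicted lesion mask and a reference mask: twice the overlap divided by the total foreground size of the two masks. The following is a minimal sketch of that standard formula on flattened binary masks, not the evaluation code behind these benchmarks. The function name `dice_coefficient` and the choice to return 1.0 when both masks are empty are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice similarity coefficient for two flattened binary masks (values 0 or 1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    # Voxels marked as lesion in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total lesion voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    # Assumption: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted voxels, 3 reference voxels, 2 in common.
pred_mask = [0, 1, 1, 1, 0, 0]
ref_mask = [0, 1, 1, 0, 1, 0]
score = dice_coefficient(pred_mask, ref_mask)
print(f"DSC = {score:.3f} ({100 * score:.1f}%)")  # DSC = 0.667 (66.7%)
```

The sketch prints the score both as a fraction and as a percentage because the rows above appear to mix the two scales (for example 0.527 next to 58.1).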
