
Astra: a generalizable report generation foundation model for 3D computed tomography

About

CT interpretation requires radiologists to review hundreds of volumetric slices per examination, making reporting time-consuming and highly expertise-dependent. Automated CT report generation offers a promising route to improving clinical efficiency, yet the field still lacks a generalizable CT report generation foundation model that supports multi-region reporting and remains robust across external real-world cohorts. Intrinsic inconsistencies in reporting style and diagnostic terminology across cohorts make naive joint training prone to noisy textual supervision, which limits model generalizability. Here we present Astra, a generalizable CT report generation foundation model trained on 90,678 thoracoabdominal CT-report pairs (CTRgDB) with 353,671 abnormalities spanning eight organ systems. By harmonizing report style and further refining diagnostic consistency via reinforcement learning, Astra achieves style-consistent and diagnostically accurate report generation across diverse anatomical regions and institutions. Evaluated on CTRgDB and six external cohorts, Astra achieves state-of-the-art performance with a 44.1% average improvement in fine-grained diagnostic metrics (P<0.001). In real-world clinical workflows, Astra assistance accelerates chest report drafting by 29.6% and improves abdominal report completeness by 11.3% (P<0.001). Astra also demonstrates broad utility as a foundation for CT AI development, improving downstream diagnostic performance and scaling vision-language pretraining through high-quality report synthesis. Overall, Astra serves as a broadly accessible clinical assistant and a pivotal infrastructure for the next generation of AI-powered healthcare.

Zhuhao Wang, Fang Chen, Chaohui Yu, Zihan Li, Yuchao Zheng, Jing Wang, Xuan Yang, Jia Guo, Zhenlu Yang, Xingju Zheng, Yihua Sun, Haojie Han, Xiaoxiao Qin, Zhan Feng, Wenbo Xiao, Chao Zhu, Yuehua Li, Shipeng Zhang, Hao Luo, Yunsong Peng, Fan Wang, Hongen Liao • 2026

Related benchmarks

Task | Dataset | Result | Rank
Report Generation | CT-RATE | -- | 26
Classification | CT-RATE (test) | Micro Precision 57.89 | 14
Fine-grained captioning | INSPECT | RaTE Score 33.05 | 14
Medical image captioning | CT-RATE (test) | RaTE Score 0.351 | 14
Medical Image Classification | BIMCV (test) | Micro Precision 35.72 | 14
Medical Report Generation | BIMCV n=1,505 cases (test) | RaTE Score 0.2624 | 14
Medical Report Generation | MERLIN (test) | RaTE Score 35.64 | 14
Natural language generation | INSPECT | BLEU-1 0.4622 | 14
Natural language generation | BIMCV | BLEU-1 40.2 | 14
Natural language generation | Merlin | BLEU-1 0.3898 | 14
Showing 10 of 24 rows
