Astra: a generalizable report generation foundation model for 3D computed tomography

About

Interpreting computed tomography (CT) requires review of hundreds of volumetric slices and remains time-intensive and expertise-dependent. Automated CT report generation offers a promising route to improving clinical efficiency, yet the field still lacks a generalizable CT report generation foundation model that supports multi-region reporting and remains robust across external real-world cohorts. Intrinsic inconsistencies in reporting style and diagnostic terminology across cohorts make naive joint training difficult. Here we present Astra, a generalizable CT report generation foundation model developed on 90,678 thoracoabdominal CT-report pairs collected from five sites worldwide (CTRgDB), comprising 353,671 abnormalities spanning eight organ systems. By harmonizing report style and further refining diagnostic consistency via reinforcement learning, Astra achieves style-consistent and diagnostically accurate report generation across diverse anatomical regions and institutions. Evaluated on CTRgDB and six external cohorts, Astra achieves state-of-the-art performance with a 38.4% average improvement in fine-grained diagnostic metrics (P<0.001). Deployed at external clinical sites without any site-specific fine-tuning, Astra accelerated chest report drafting by 29.6% and improved abdominal report completeness by 11.3% among junior and mid-level radiologists (P<0.001). Furthermore, Astra demonstrates broad utility as a foundation for CT AI development, improving downstream diagnostic performance and scaling vision-language pretraining through high-quality report synthesis. Overall, Astra serves as a broadly accessible clinical assistant and as pivotal infrastructure for the next generation of AI-powered healthcare. The code for Astra is publicly available at https://github.com/zh-Wang-Med/Astra.

Zhuhao Wang, Fang Chen, Chaohui Yu, Zihan Li, Yuchao Zheng, Jing Wang, Xuan Yang, Jia Guo, Zhenlu Yang, Xingju Zheng, Yihua Sun, Haojie Han, Xiaoxiao Qin, Zhan Feng, Wenbo Xiao, Chao Zhu, Yuehua Li, Shipeng Zhang, Hao Luo, Yunsong Peng, Fan Wang, Hongen Liao • 2026

Related benchmarks

Task	Dataset	Result
Report Generation	CT-RATE	--	26
Classification	CT-RATE (test)	Micro Precision 57.89	14
Fine-grained captioning	INSPECT	RaTE Score 33.05	14
Medical image captioning	CT-RATE (test)	RaTE Score 0.351	14
Medical Image Classification	BIMCV (test)	Micro Precision 35.72	14
Medical Report Generation	BIMCV n=1,505 cases (test)	RaTE Score 0.2624	14
Medical Report Generation	MERLIN (test)	RaTE Score 35.64	14
Natural language generation	INSPECT	BLEU-1 0.4622	14
Natural language generation	BIMCV	BLEU-1 40.2	14
Natural language generation	Merlin	BLEU-1 0.3898	14

Showing 10 of 24 rows
