Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

RetinaLogos: Fine-Grained Synthesis of High-Resolution Retinal Images Through Captions

About

The scarcity of high-quality, labelled retinal imaging data, which presents a significant challenge in the development of machine learning models for ophthalmology, hinders progress in the field. Existing methods for synthesising Colour Fundus Photographs (CFPs) largely rely on predefined disease labels, which restricts their ability to generate images that reflect fine-grained anatomical variations, subtle disease stages, and diverse pathological features beyond coarse class categories. To overcome these challenges, we first introduce an innovative pipeline that creates a large-scale, captioned retinal dataset comprising 1.4 million entries, called RetinaLogos-1400k. Specifically, RetinaLogos-1400k uses the visual language model(VLM) to describe retinal conditions and key structures, such as optic disc configuration, vascular distribution, nerve fibre layers, and pathological features. Building on this dataset, we employ a novel three-step training framework, RetinaLogos, which enables fine-grained semantic control over retinal images and accurately captures different stages of disease progression, subtle anatomical variations, and specific lesion types. Through extensive experiments, our method demonstrates superior performance across multiple datasets, with 62.07% of text-driven synthetic CFPs indistinguishable from real ones by ophthalmologists. Moreover, the synthetic data improves accuracy by 5%-10% in diabetic retinopathy grading and glaucoma detection. Codes are available at https://github.com/uni-medical/retina-text2cfp.

Junzhi Ning, Cheng Tang, Kaijing Zhou, Diping Song, Lihao Liu, Ming Hu, Wei Li, Huihui Xu, Yanzhou Su, Tianbin Li, Jiyao Liu, Jin Ye, Sheng Zhang, Yuanfeng Ji, Junjun He• 2025

Related benchmarks

TaskDatasetResultRank
Glaucoma DetectionREFUGE 2.0 (test)--
40
Image SynthesisAPTOS
FID56.078
7
Image SynthesisEyePACS
Inception Score1.884
6
Image SynthesisAIROGS
Inception Score1.884
6
Diabetic Retinopathy Grading ClassificationIDRID Eval (test)
Accuracy63.75
4
Showing 5 of 5 rows

Other info

Code

Follow for update