MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning

About

Text-guided image editors can now manipulate authentic medical scans with high fidelity, enabling lesion implantation/removal that threatens clinical trust and safety. Existing defenses are inadequate for healthcare. Medical detectors are largely black-box, while MLLM-based explainers are typically post-hoc, lack medical expertise, and may hallucinate evidence on ambiguous cases. We present MedForge, a data-and-method solution for pre-hoc, evidence-grounded medical forgery detection. We introduce MedForge-90K, a large-scale benchmark of realistic lesion edits across 19 pathologies with expert-guided reasoning supervision via doctor inspection guidelines and gold edit locations. Building on it, MedForge-Reasoner performs localize-then-analyze reasoning, predicting suspicious regions before producing a verdict, and is further aligned with Forgery-aware GSPO to strengthen grounding and reduce hallucinations. Experiments demonstrate state-of-the-art detection accuracy and trustworthy, expert-aligned explanations.

Zhihui Chen, Kai He, Qingyuan Lei, Bin Pu, Jian Zhang, Yuling Xu, Mengling Feng • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Forgery Detection | MedForge Real Forgery (In-Domain) 90K | Accuracy 99.24 | 18 |
| Forgery Detection | MedForge-90K Real Forgery Cross-Forgery | Accuracy 95.24 | 9 |
| Forgery Detection | MedForge Real Forgery Cross-Model 90K | Accuracy 92.86 | 9 |
| Forgery Detection | MedForge-90K Implant Forgery Cross-Forgery | Accuracy 93.39 | 9 |
| Forgery Detection | MedForge-90K Implant Forgery Cross-Model | Accuracy 94.86 | 9 |
| Forgery Detection | MedForge-90K Remove Forgery (In-Domain) | Accuracy 99.21 | 9 |
| Forgery Detection | MedForge-90K Remove Forgery Cross-Forgery | Accuracy 99.15 | 9 |
| Forgery Detection | MedForge-90K Remove Forgery Cross-Model | Accuracy 94.09 | 9 |