MedGemma vs GPT-4: Open-Source and Proprietary Zero-shot Medical Disease Classification from Images

About

Multimodal large language models (LLMs) introduce an emerging paradigm for medical imaging by interpreting scans through the lens of extensive clinical knowledge, offering a transformative approach to disease classification. This study presents a critical comparison between two fundamentally different AI architectures, the specialized open-source agent MedGemma and the proprietary large multimodal model GPT-4, for diagnosing six different diseases. The MedGemma-4b-it model, fine-tuned using Low-Rank Adaptation (LoRA), demonstrated superior diagnostic capability, achieving a mean test accuracy of 80.37% compared to 69.58% for the untuned GPT-4. Furthermore, MedGemma exhibited notably higher sensitivity in high-stakes clinical tasks, such as cancer and pneumonia detection. Quantitative analysis via confusion matrices and classification reports provides comprehensive insights into model performance across all categories. These results emphasize that domain-specific fine-tuning is essential for minimizing hallucinations in clinical implementation, positioning MedGemma as a sophisticated tool for complex, evidence-based medical reasoning.

Md. Sazzadul Islam Prottasha, Nabil Walid Rafi • 2025

Related benchmarks

Task                          Dataset                         Accuracy (%)  Rank
Medical Image Classification  SC (Skin Cancer) (test)         79.05         33
Medical Image Classification  Alzheimer’s Disease (test)      80.44         2
Medical Image Classification  Breast Cancer (test)            81.11         2
Medical Image Classification  Cardiovascular Disease (test)   79.34         2
Medical Image Classification  Pneumonia (test)                81.71         2
Medical Image Classification  Chronic Kidney Disease (test)   80.57         2
Medical Image Classification  Skin Cancer (train)             82.2          1
Medical Image Classification  Skin Cancer (val)               80.4          1
Medical Image Classification  Alzheimer’s Disease (train)     85.1          1
Medical Image Classification  Alzheimer’s Disease (val)       82.3          1
Showing 10 of 18 rows
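
As a quick consistency check, the six test-split accuracies listed above can be averaged and compared with the 80.37% mean test accuracy quoted in the abstract. The sketch below assumes that the listed test rows are the MedGemma results for the six diseases and that the reported mean is an unweighted average over them. The page states neither, and the names `test_accuracy` and `mean_accuracy` are illustrative only.

```python
# Per-disease test accuracies (%) copied from the benchmark rows above.
# Assumption: these six rows are the MedGemma test results for the six diseases.
test_accuracy = {
    "Skin Cancer": 79.05,
    "Alzheimer's Disease": 80.44,
    "Breast Cancer": 81.11,
    "Cardiovascular Disease": 79.34,
    "Pneumonia": 81.71,
    "Chronic Kidney Disease": 80.57,
}

# GPT-4 mean test accuracy (%) as quoted in the abstract.
GPT4_MEAN_TEST_ACCURACY = 69.58


def mean_accuracy(scores):
    """Unweighted mean of per-dataset accuracies, rounded to two decimals."""
    return round(sum(scores.values()) / len(scores), 2)


mean_test = mean_accuracy(test_accuracy)
gap = round(mean_test - GPT4_MEAN_TEST_ACCURACY, 2)

print(f"Mean test accuracy: {mean_test:.2f}%")          # Mean test accuracy: 80.37%
print(f"Gap to GPT-4: {gap:.2f} percentage points")     # Gap to GPT-4: 10.79 percentage points
```

Under these assumptions the unweighted mean reproduces the 80.37% figure, which suggests the abstract's headline number is a simple average over the six test sets rather than a sample-weighted one.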
