Echo-α: Large Agentic Multimodal Reasoning Model for Ultrasound Interpretation
About
Ultrasound interpretation requires both precise lesion localization and holistic clinical reasoning, yet existing methods typically excel at only one of these capabilities: specialized detectors offer strong localization but limited reasoning, whereas multimodal large language models (MLLMs) provide flexible reasoning but weak grounding in specialized medical domains. We present Echo-α, an agentic multimodal reasoning model for ultrasound interpretation that unifies these strengths within an invoke-and-reason framework. Echo-α is trained to coordinate organ-specific detector outputs, integrate them with global visual context, and convert the resulting evidence into grounded diagnostic decisions beyond detector-only inference. This behavior is established through a nine-task supervised curriculum and then refined by sequential reinforcement learning under different reward trade-offs, yielding Echo-α-Grounding for lesion anchoring and Echo-α-Diagnosis for final diagnosis. On multi-center renal and breast ultrasound benchmarks, Echo-α outperforms competitive baselines on both grounding and diagnosis. In particular, on cross-center test sets, Echo-α-Grounding attains 56.73%/43.78% F1@0.5 and Echo-α-Diagnosis reaches 74.90%/49.20% overall accuracy on renal/breast ultrasound. These results suggest that agentic multimodal reasoning can turn specialized detectors into verifiable clinical evidence, offering a practical route toward ultrasound AI systems that are more accurate, interpretable, and transferable. The repository is at https://github.com/MiliLab/Echo-Alpha.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Grounding | Renal Ultrasound cross-center (test) | F1 Score @ IoU=0.25 | 69 | 12 |
| Grounding | Breast Ultrasound same-center (val) | F1 Score @ IoU=0.25 | 52.96 | 12 |
| Ultrasound Diagnosis | Breast Ultrasound (test) | Overall Accuracy | 49.2 | 8 |
| Ultrasound Diagnosis | Breast Ultrasound (val) | Overall Accuracy | 48.75 | 8 |
| Grounding | Renal Ultrasound same-center (val) | F1@0.25 | 73.38 | 6 |
| Grounding | Breast Ultrasound cross-center (test) | F1 Score @ IoU=0.25 | 45.72 | 6 |
| Visual Grounding | Breast Ultrasound (test) | F1 Score (IoU=0.25) | 45.72 | 6 |
| Diagnosis | Renal Ultrasound cross-center (test) | Accuracy | 74.9 | 4 |
| Diagnosis | Renal Ultrasound (val) | Overall Accuracy | 77.43 | 4 |
| Diagnosis | Renal Ultrasound (test) | Overall Accuracy | 74.9 | 4 |
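
The grounding rows above report F1 at an IoU threshold. As a rough illustration of how such a score is commonly computed for one image, the sketch below matches each predicted box to at most one ground-truth box and counts a true positive when their IoU reaches the threshold. The greedy one-to-one matching, the `(x1, y1, x2, y2)` box format, the function names `iou` and `f1_at_iou`, and the handling of the empty case are all assumptions made for illustration; the exact evaluation protocol used for Echo-α is not described on this page.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(preds: List[Box], gts: List[Box], thr: float = 0.25) -> float:
    """F1 for one image using greedy one-to-one matching (an assumption)."""
    matched = set()  # indices of ground-truth boxes already claimed
    tp = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, g in enumerate(gts):
            if j in matched:
                continue
            v = iou(p, g)
            if v > best_iou:
                best_j, best_iou = j, v
        if best_j >= 0 and best_iou >= thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp  # predictions with no matching lesion
    fn = len(gts) - tp    # lesions that no prediction covered
    denom = 2 * tp + fp + fn
    # Convention chosen here: no lesions and no predictions counts as perfect.
    return 2 * tp / denom if denom else 1.0
```

With a single ground-truth box `(0, 0, 10, 10)` and a prediction `(0, 0, 10, 5)`, the IoU is 0.5, so the prediction counts as a hit at a threshold of 0.25 but as a miss at 0.75. Lower thresholds are therefore more forgiving of loose localization, which is one reason scores at 0.25 and at 0.5 are not directly comparable.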