ECG-Agent: On-Device Tool-Calling Agent for ECG Multi-Turn Dialogue
About
Recent advances in Multimodal Large Language Models have rapidly expanded to electrocardiograms (ECGs), focusing on classification, report generation, and single-turn QA tasks. However, these models fall short in real-world scenarios: they lack multi-turn conversational ability, on-device efficiency, and a precise understanding of ECG measurements such as the PQRST intervals. To address these limitations, we introduce ECG-Agent, the first LLM-based tool-calling agent for multi-turn ECG dialogue. To facilitate its development and evaluation, we also present the ECG-Multi-Turn-Dialogue (ECG-MTD) dataset, a collection of realistic user-assistant multi-turn dialogues for diverse ECG lead configurations. We develop ECG-Agents in various sizes, from on-device capable to larger agents. Experimental results show that ECG-Agents outperform baseline ECG-LLMs in response accuracy. Furthermore, on-device agents achieve performance comparable to larger agents across evaluations of response accuracy, tool-calling ability, and hallucination, demonstrating their viability for real-world applications.
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Dialogue Quality Assessment | ECG-MTD | Naturalness | 3.98 | 7 |
| ECG dialogue response quality evaluation | ECG-MTD 12-lead (test) | Accuracy | 3.54 | 7 |
| ECG dialogue response quality evaluation | ECG-MTD Lead I (test) | Accuracy | 3.76 | 7 |
| ECG dialogue response quality evaluation | ECG-MTD Lead II (test) | Accuracy | 3.88 | 7 |
| Direct response evaluation | ECG-MTD (test) | Accuracy | 3.78 | 7 |
| Next Action Prediction | ECG-MTD 12 Leads (test) | NAP Accuracy (w/o GT) | 94.95 | 5 |
| Next Action Prediction | ECG-MTD Lead I (test) | NAP Accuracy (w/o GT) | 93.03 | 5 |
| Next Action Prediction | ECG-MTD Lead II (test) | NAP Accuracy (w/o GT) | 92.16 | 5 |
| Faithfulness | ECG-MTD 12 Leads (test) | Faithfulness | 88.98 | 4 |
| Faithfulness | ECG-MTD Lead I (test) | Faithfulness | 93.04 | 4 |