TeamMedAgents: Pareto-Efficient Multi-Agent Medical Reasoning Through Teamwork Theory
About
Complex medical reasoning has historically required frontier language models to achieve clinically acceptable accuracy, creating computational barriers that limit deployment in resource-constrained clinical settings. We present TeamMedAgents, a modular multi-agent framework that translates Salas et al.'s evidence-based teamwork theory into computational mechanisms (shared mental models, team leadership, team orientation, trust networks, and mutual monitoring), enabling Small Language Models to perform multi-step clinical reasoning efficiently. Evaluation across 8 medical benchmarks demonstrates that TeamMedAgents advances the Pareto efficiency frontier by one to two orders of magnitude, achieving competitive accuracy at substantially lower token cost than MDAgents, MedAgents, DyLAN, and ReConcile. The framework exhibits the lowest cross-dataset variance among multi-agent approaches, enabling deployment without per-task tuning. Our results establish that theory-grounded coordination mechanisms provide essential scaffolding for deploying efficient medical AI in resource-constrained clinical environments.
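The Pareto-efficiency claim in the abstract concerns a trade-off between accuracy (higher is better) and token cost per question (lower is better). The following is a minimal sketch of how such a frontier could be computed. The `Point` type, the function names, and every number below are hypothetical placeholders chosen for illustration; they are not results from the paper.

```python
from typing import List, NamedTuple


class Point(NamedTuple):
    """One method's operating point (hypothetical structure)."""
    name: str
    accuracy: float    # percent, higher is better
    token_cost: float  # tokens per question, lower is better


def dominates(a: Point, b: Point) -> bool:
    """True if `a` is at least as good as `b` on both axes and strictly better on one."""
    no_worse = a.accuracy >= b.accuracy and a.token_cost <= b.token_cost
    strictly_better = a.accuracy > b.accuracy or a.token_cost < b.token_cost
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Placeholder values only; NOT measurements reported by the authors.
points = [
    Point("method_a", 80.0, 3_000.0),
    Point("method_b", 82.0, 40_000.0),
    Point("method_c", 78.0, 50_000.0),
    Point("method_d", 75.0, 2_000.0),
]

front = pareto_front(points)
for p in front:
    print(f"{p.name}: accuracy={p.accuracy}, tokens/question={p.token_cost}")
```

In this toy data, `method_c` is dropped because `method_a` is both more accurate and cheaper. A method "advances the frontier" in this sense when it reaches comparable accuracy at a token cost that no existing point matches.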
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Medical Question Answering | MedMCQA | Accuracy | 51.7 | 346 |
| Medical Question Answering | MedQA | Accuracy | 88.1 | 153 |
| Question Answering | PubMedQA | Accuracy | 79.2 | 145 |
| Medical Question Answering | PubMedQA | Accuracy | 68.7 | 92 |
| Medical Visual Question Answering | PMC-VQA | Accuracy | 56.4 | 74 |
| Medical Question Answering | Medbullets | Accuracy | 80.3 | 65 |
| Multi-task Language Understanding | MMLU-Pro | Accuracy | 31.7 | 55 |
| Medical Visual Question Answering | PathVQA | Accuracy | 76.8 | 50 |
| Medical Question Answering | DDXPlus | Accuracy | 82.4 | 43 |
| Vision-Language Medical Reasoning | PathVQA | Token Cost (tokens/question) | 3.65e+3 | 29 |