ClinTutor-R1: Advancing Scalable and Robust One-to-Many Alignment in Clinical Socratic Education

About

While Large Language Models (LLMs) have achieved remarkable success in dyadic (one-on-one) instruction, they face significant challenges in one-to-many alignment, as in clinical ward rounds, where an instructor must simultaneously guide a diverse group of trainees. Current models often suffer from context dilution and goal misalignment, failing to balance individual scaffolding with collective learning progress. To address this, we introduce ClinEdu, a multi-agent pedagogical simulator that models the complexity of group dynamics. Leveraging this platform, we construct ClinTeach, a large-scale dataset of Socratic teaching dialogues, and propose ClinTutor-R1, the first vision-language agent explicitly architected to achieve one-to-many alignment in clinical education. ClinTutor-R1 employs an explicit internal thinking mechanism to model both individual belief states and group consensus. We validate our framework through a comprehensive protocol covering static benchmarks, in-situ interactive evaluation within ClinEdu, expert assessment, and a 200-participant real user study. Experimental results demonstrate that ClinTutor-R1 outperforms base models by over 20% and achieves parity with proprietary models, while maintaining instructional quality as student cohorts grow.

Zhitao He, Haolin Yang, Zeyu Qin, Yi R Fung • 2025

Related benchmarks

Task	Dataset	Result
Medical Visual Question Answering	PMC-VQA	Accuracy 56.3	103
Medical Visual Question Answering	MedXpertQA	Accuracy 25.1	44
Medical Question Answering	MedXpertQA (test)	ETS Score 8.33	23
Medical Question Answering	MVME (test)	ETS 8.41	23
Medical Visual Question Answering	MMMU	Accuracy 58.82	19
