Human or LLM as Standardized Patients? A Comparative Study for Medical Education
About
Standardized patients (SPs) are indispensable for clinical skills training but remain expensive and difficult to scale. Although large language model (LLM)-based virtual standardized patients (VSPs) have been proposed as an alternative, their behavior remains unstable, and they have not been rigorously compared with human standardized patients. We propose EasyMED, a multi-agent VSP framework that separates case-grounded information disclosure from response generation to support stable, inquiry-conditioned patient behavior. We also introduce SPBench, a human-grounded benchmark with eight expert-defined criteria for interaction-level evaluation. Experiments show that EasyMED more closely matches human SP behavior than existing VSPs, particularly in case consistency and controlled disclosure. A four-week controlled study further demonstrates learning outcomes comparable to human SP training, with stronger early gains for novice learners and improved flexibility, psychological safety, and cost efficiency.
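The sketch below illustrates the separation the abstract describes: one component gates which case facts an inquiry may disclose, and a second turns only those facts into a patient-voice reply. This is not the authors' implementation; the class names, the keyword-matching heuristic, and the template-based reply are hypothetical stand-ins for the paper's LLM-based agents.

```python
# Minimal sketch of a two-stage VSP pipeline: disclosure is decided
# separately from response generation, so the patient cannot volunteer
# facts the student never asked about. All names here are hypothetical.
from dataclasses import dataclass

@dataclass
class CaseFact:
    topic: str    # e.g. "chief_complaint", "medication"
    content: str  # the ground-truth detail from the SP case script

CASE = [
    CaseFact("chief_complaint", "crushing chest pain for the last two hours"),
    CaseFact("medication", "takes daily aspirin"),
    CaseFact("family_history", "father had a heart attack at 55"),
]

def disclosure_agent(inquiry: str, case: list[CaseFact]) -> list[CaseFact]:
    """Gate case-grounded information: release only facts the inquiry asks
    about. A real system would use an LLM relevance judge; keyword overlap
    is a stub."""
    keywords = {
        "chief_complaint": ["pain", "wrong", "brings"],
        "medication": ["medication", "drugs", "taking"],
        "family_history": ["family", "father", "mother"],
    }
    inquiry_lower = inquiry.lower()
    return [f for f in case if any(k in inquiry_lower for k in keywords[f.topic])]

def response_agent(facts: list[CaseFact]) -> str:
    """Generate the patient reply from the disclosed facts only, so surface
    wording can vary while the content stays case-consistent."""
    if not facts:
        return "I'm not sure what you mean, doctor."
    return "Well, " + "; and ".join(f.content for f in facts) + "."

if __name__ == "__main__":
    for inquiry in ["What brings you in today?", "Are you taking any medication?"]:
        facts = disclosure_agent(inquiry, CASE)
        print(f"Student: {inquiry}\nPatient: {response_agent(facts)}\n")
```

Keeping disclosure as its own stage is what makes the behavior inquiry-conditioned: the response generator never sees undisclosed facts, so it cannot leak them regardless of how it phrases the reply.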
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Virtual Standardized Patient Simulation | SPBench | QC Score | 97.17 | 9 |
| Clinical Skill Acquisition | OSCE Pre-test | Mean Score | 70.56 | 2 |
| Clinical Skill Acquisition | OSCE Mid-test | Mean Score | 86.07 | 2 |
| Clinical Skill Acquisition | OSCE Post-test | Mean Score | 87.44 | 2 |
| Clinical Skill Acquisition | OSCE Phase 1 (Weeks 1-2) | Mean Score Gain | 15.51 | 2 |
| Clinical Skill Acquisition | OSCE Phase 2 (Weeks 3-4) | Mean Score Gain | 3.19 | 2 |
| Clinical Skill Acquisition | OSCE Total Gain | Mean Score Gain | 16.89 | 2 |
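For readers unfamiliar with the "Mean Score Gain" rows: a phase gain is conventionally the mean of each learner's end-of-phase score minus their start-of-phase score. The sketch below shows that computation on invented data; whether the paper aggregates exactly this way (e.g. over the same learner subset in every phase) is an assumption, and the numbers here are not the study's data.

```python
# Minimal sketch, with invented scores, of phase-wise mean OSCE gains.
from statistics import mean

# Hypothetical (learner -> [pre, mid, post]) OSCE scores out of 100.
scores = {
    "learner_1": [68.0, 84.0, 86.0],
    "learner_2": [73.0, 88.0, 89.0],
}

# Transpose per-learner rows into per-test columns.
pre, mid, post = (list(col) for col in zip(*scores.values()))

phase1_gain = mean(m - p for p, m in zip(pre, mid))   # Weeks 1-2
phase2_gain = mean(q - m for m, q in zip(mid, post))  # Weeks 3-4
total_gain = mean(q - p for p, q in zip(pre, post))   # pre -> post

print(f"Phase 1 gain: {phase1_gain:.2f}")
print(f"Phase 2 gain: {phase2_gain:.2f}")
print(f"Total gain:   {total_gain:.2f}")
```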