A generalizable foundation model for intraoperative understanding across surgical procedures

About

In minimally invasive surgery, clinical decisions depend on real-time visual interpretation, yet intraoperative perception varies substantially across surgeons and procedures. This variability limits consistent assessment, training, and the development of reliable artificial intelligence systems, as most surgical AI models are designed for narrowly defined tasks and do not generalize across procedures or institutions. Here we introduce ZEN, a generalizable foundation model for intraoperative surgical video understanding trained on more than 4 million frames from over 21 procedures using a self-supervised multi-teacher distillation framework. We curated a large and diverse dataset and systematically evaluated multiple representation learning strategies within a unified benchmark. Across 20 downstream tasks, evaluated under full fine-tuning, frozen-backbone, few-shot, and zero-shot settings, ZEN consistently outperforms existing surgical foundation models and demonstrates robust cross-procedure generalization. These results suggest a step toward unified representations for surgical scene understanding and support future applications in intraoperative assistance and surgical training assessment.
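
The abstract names a multi-teacher distillation framework but does not give its objective. As a rough illustration only, the general idea can be sketched as a student feature vector being pulled toward the features of several frozen teachers, with the per-teacher losses averaged. Everything below is an assumption made for illustration, not ZEN's actual loss: the function names, the cosine-distance criterion, the uniform weights, and the premise that all features already share one dimension.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def multi_teacher_distillation_loss(student_feat, teacher_feats, weights=None):
    """Weighted sum of student-to-teacher cosine distances.

    Hypothetical helper: assumes every teacher feature has already been
    projected to the student's dimensionality. Weights default to uniform.
    """
    if weights is None:
        weights = [1.0 / len(teacher_feats)] * len(teacher_feats)
    return sum(
        w * cosine_distance(student_feat, t)
        for w, t in zip(weights, teacher_feats)
    )


# Toy example: the student matches teacher 0 and is orthogonal to teacher 1.
student = [1.0, 0.0]
teachers = [[1.0, 0.0], [0.0, 1.0]]
loss = multi_teacher_distillation_loss(student, teachers)
```

In practice such a loss would be computed on batched tensors in a deep learning framework, usually with a separate projection head per teacher; the pure-Python form here only shows the structure of the objective.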

Kanggil Park, Yongjun Jeon, Soyoung Lim, Seonmin Park, Jongmin Shin, Jung Yong Kim, Sehyeon An, Jinsoo Rhu, Jongman Kim, Gyu-Seong Choi, Namkee Oh, Kyu-Hwan Jung • 2026

Related benchmarks

Task	Dataset	Result
Surgical Phase Recognition	Cholec80	--	70
Surgical Phase Recognition	MultiBypass140	Phase-level Precision 0.7381	39
Depth Estimation	Hamlyn	Abs Rel 0.1554	31
Surgical Phase Recognition	Cholec80 (test)	Precision 82.03	28
Action Triplet Recognition	CholecT50	AP (I) 86.11	27
Monocular Depth Estimation	SCARED	Abs Rel 0.1306	27
Closed-ended Visual Question Answering	PitVQA	F1 Score 60.43	26
Closed-ended Visual Question Answering	LLS48-VQA	F1 Score 23.35	26
Instance Segmentation	Grasp	mAP (Mask) 0.5597	26
Object Detection	Grasp	mAP (BBox) 62.5	26

Showing 10 of 20 rows
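
The Abs Rel values listed for Hamlyn and SCARED refer to absolute relative error, which in its standard form is the mean of |predicted depth − ground-truth depth| / ground-truth depth over valid pixels (lower is better). A minimal sketch follows, assuming depths arrive as flat lists and that pixels with non-positive ground truth are treated as invalid; the function name and masking rule are illustrative and may differ from the evaluation protocol used for these benchmarks.

```python
def abs_rel(pred, gt):
    """Mean absolute relative depth error over pixels with gt > 0."""
    # Keep only pixels that have a valid (positive) ground-truth depth.
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not valid:
        raise ValueError("no valid ground-truth depth values")
    return sum(abs(p - g) / g for p, g in valid) / len(valid)


# Toy example: the third pixel has no ground truth and is skipped.
error = abs_rel([1.1, 2.0, 5.0], [1.0, 2.0, 0.0])
```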
