High-level Diagnostic Reasoning on EndoAgentBench

76.03 CAP (CAR)

GPT-4o

Updated 2mo ago

Evaluation Results

Method	Links	CAP (CAR)		
GPT-4o 2025.08		76.03	50.19	63.10
EndoCogniAgent 2025.08		64.00	78.24	71.13
Step-1o 2025.08		61.18	54.69	57.93
Gemini 2.5 Pro 2025.08		53.29	35.83	44.55
ColonGPT 2025.08		52.91	20.54	36.71
Yi-Vision 2025.08		47.09	25.52	36.29
HuatuoGPT 2025.08		41.07	20.54	30.80
LLaVA-Med 2025.08		3.20	2.25	2.72