High-level Diagnostic Reasoning on EndoAgentBench
[Chart: CAP (CAR) by date. Plotted point: GPT-4o, 76.03, Aug 10, 2025. Metrics shown: CAP (CAR), MRG (CAR), Average (CAR).]
Evaluation Results
Method           Category        Date     CAP (CAR)  MRG (CAR)  Average (CAR)
GPT-4o           General MLLMs   2025.08  76.03      50.19      63.1
EndoCogniAgent   Our Method      2025.08  64         78.24      71.13
Step-1o          General MLLMs   2025.08  61.18      54.69      57.93
Gemini 2.5 Pro   General MLLMs   2025.08  53.29      35.83      44.55
ColonGPT         Medical MLLMs   2025.08  52.91      20.54      36.71
Yi-Vision        General MLLMs   2025.08  47.09      25.52      36.29
HuatuoGPT        Medical MLLMs   2025.08  41.07      20.54      30.8
LLaVA-Med        Medical MLLMs   2025.08  3.2        2.25       2.72
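
The rows above are listed in descending CAP (CAR) order, so the ordering changes when a different column is used. The following is a minimal sketch for re-ranking the same scores by any of the three metrics. The names `ROWS`, `COLUMNS`, `rank_by`, and `max_mean_gap` are hypothetical and are not part of the benchmark. The page does not state how Average (CAR) is computed; the check at the end only measures how far the listed averages sit from the plain arithmetic mean of CAP and MRG, which is an assumption and not a documented formula.

```python
# Scores copied from the results table: (method, CAP, MRG, Average), all CAR values.
ROWS = [
    ("GPT-4o", 76.03, 50.19, 63.1),
    ("EndoCogniAgent", 64, 78.24, 71.13),
    ("Step-1o", 61.18, 54.69, 57.93),
    ("Gemini 2.5 Pro", 53.29, 35.83, 44.55),
    ("ColonGPT", 52.91, 20.54, 36.71),
    ("Yi-Vision", 47.09, 25.52, 36.29),
    ("HuatuoGPT", 41.07, 20.54, 30.8),
    ("LLaVA-Med", 3.2, 2.25, 2.72),
]

# Hypothetical mapping from metric name to tuple position in ROWS.
COLUMNS = {"CAP": 1, "MRG": 2, "Average": 3}


def rank_by(metric):
    """Return method names sorted by the chosen metric, highest first."""
    idx = COLUMNS[metric]
    # sorted() is stable, so tied scores keep their order from ROWS.
    return [row[0] for row in sorted(ROWS, key=lambda r: r[idx], reverse=True)]


def max_mean_gap():
    """Largest absolute gap between the listed Average and (CAP + MRG) / 2.

    Assumption only: the arithmetic mean is used here as a reference point,
    not as the benchmark's documented definition of Average (CAR).
    """
    return max(abs((cap + mrg) / 2 - avg) for _, cap, mrg, avg in ROWS)


if __name__ == "__main__":
    for metric in COLUMNS:
        print(metric, rank_by(metric))
    print("max gap vs. arithmetic mean:", max_mean_gap())
```

Ranking by MRG (CAR) or Average (CAR) puts EndoCogniAgent first, while ranking by CAP (CAR) puts GPT-4o first. The listed averages stay within 0.02 points of the arithmetic mean but do not match it exactly, so they are best treated as reported values and not recomputed.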