MCQ Diagnostic Accuracy on Stanford CMR (test)
[Chart: Accuracy; MARCUS, 88, Mar 23, 2026]
Evaluation Results
Method                         N    Date     Accuracy (%)  95% CI (Lower Bound)  p-value
MARCUS                         50   2026.03  88            78                    -
GPT-5                          50   2026.03  58            44                    -
Gemini 2.5 Pro                 50   2026.03  44            30                    -
MARCUS vs GPT-5 (McNemar p)    50   2026.03  -             -                     0.001
MARCUS vs Gemini (McNemar p)   50   2026.03  -             -                     0.001
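The page does not say how the 95% CI lower bounds were computed. As a rough sanity check only, the sketch below applies a normal-approximation (Wald) interval to the reported accuracies, assuming they are percentages over N=50 questions. The function name `wald_lower_bound` is hypothetical, and the Wald interval is an assumption rather than the benchmark's stated method. Under that assumption the bounds land within about a point of the reported values for all three models, which suggests the reported figures may come from a different procedure such as a bootstrap.

```python
import math


def wald_lower_bound(accuracy_pct: float, n: int, z: float = 1.96) -> float:
    """Lower bound (in percent) of a two-sided 95% Wald interval.

    Assumption: accuracy_pct is a percentage measured on n independent items.
    This is an illustrative method, not necessarily the one the benchmark used.
    """
    p = accuracy_pct / 100.0
    standard_error = math.sqrt(p * (1.0 - p) / n)
    # Clamp at zero so the bound stays a valid percentage.
    return max(0.0, 100.0 * (p - z * standard_error))


# Accuracies as reported in the results table, all with N=50.
reported = {"MARCUS": 88, "GPT-5": 58, "Gemini 2.5 Pro": 44}

for method, accuracy in reported.items():
    print(f"{method}: lower bound ~ {wald_lower_bound(accuracy, 50):.1f}")
```

With only 50 items the Wald approximation is coarse, so a Wilson or exact interval would usually be a safer choice for a real analysis.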