Medical Reasoning Repair on MedBrowse (test)
[Chart not reproduced: Pass Count over time. Highlighted point: Direct, 170, May 25, 2026. Selectable metrics: Pass Count, Fail Count, Repair Count, Minimum Efficiency Score.]
Updated 8d ago
Evaluation Results
| Method | Total | Date | Pass Count | Fail Count | Repair Count | Minimum Efficiency Score |
|---|---|---|---|---|---|---|
| Direct | 484 | 2026.05 | 170 | 314 | 0 | 1 |
| Self-Refine | 484 | 2026.05 | 169 | 315 | 12 | 0.82 |
| Self-Reflection | 484 | 2026.05 | 169 | 315 | 21 | 0.65 |
| CausalFlow | 484 | 2026.05 | 149 | 335 | 149 | 0.84 |
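The counts reported above can be sanity-checked and turned into a pass rate with a few lines of Python. This is a minimal sketch: the pass rate is a derived quantity that the page does not report, and the names `results`, `TOTAL`, and `pass_rate` are illustrative assumptions rather than anything defined by the benchmark.

```python
# Counts copied from the evaluation results for this benchmark.
# "pass_rate" is a derived, illustrative metric (an assumption, not shown on the page).
TOTAL = 484

results = {
    "Direct":          {"passed": 170, "failed": 314, "repaired": 0},
    "Self-Refine":     {"passed": 169, "failed": 315, "repaired": 12},
    "Self-Reflection": {"passed": 169, "failed": 315, "repaired": 21},
    "CausalFlow":      {"passed": 149, "failed": 335, "repaired": 149},
}


def pass_rate(row: dict) -> float:
    """Fraction of test items passed, computed as passed / (passed + failed)."""
    return row["passed"] / (row["passed"] + row["failed"])


for name, row in results.items():
    # Each method's pass and fail counts should add up to the stated total.
    assert row["passed"] + row["failed"] == TOTAL, name
    print(f"{name:<16} {pass_rate(row):.1%}")
```

Under this definition, Direct passes roughly 35% of the 484 items and CausalFlow roughly 31%, so the methods differ by about four percentage points in pass rate even though their repair counts range from 0 to 149.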