Clinical order generation on MedChain (test)
[Chart: Precision, Recall, and F1 Score. Highlighted point: MedResearcher-R1, Precision 37.88, May 31, 2026.]
Evaluation Results
Method               Category           Date     Precision  Recall  F1 Score
MedResearcher-R1     Agentic Reaso...   2026.05  37.88      40.09   30.31
Tongyi DeepResearch  Agentic Reaso...   2026.05  32.15      35.79   24.13
ReflecTool           Single-Agent...    2026.05  28.73      38.96   29.91
MDAgents             Multi-Agent M...   2026.05  27.62      45.18   30.3
CAREAgent            Agentic Reaso...   2026.05  26.78      51.73   31.75
ReAct                Single-Agent...    2026.05  26.69      23.03   21.84
AgentClinic          Multi-Agent M...   2026.05  21.13      21.25   18.97
MedAgents            Multi-Agent M...   2026.05  18.85      36.28   21.82