Clinical Order Generation on Private dataset (test)
[Chart: Precision over time. Best result: CAREAgent, 22.92 Precision (May 31, 2026). Available metrics: Precision, Recall, F1 Score.]
Updated 1d ago
Evaluation Results

| Method | Category | Date | Precision | Recall | F1 Score |
|---|---|---|---|---|---|
| CAREAgent | Agentic Reaso... | 2026.05 | 22.92 | 39.81 | 23.75 |
| MedResearcher-R1 | Agentic Reaso... | 2026.05 | 21.63 | 33.34 | 21.19 |
| ReflecTool | Single-Agent... | 2026.05 | 21.47 | 29.15 | 21.27 |
| MDAgents | Multi-Agent M... | 2026.05 | 19.12 | 32.5 | 19.79 |
| MedAgents | Multi-Agent M... | 2026.05 | 18.29 | 30.19 | 18.69 |
| Tongyi DeepResearch | Agentic Reaso... | 2026.05 | 17.92 | 30.34 | 18.71 |
| ReAct | Single-Agent... | 2026.05 | 17.39 | 17.61 | 14.53 |
| AgentClinic | Multi-Agent M... | 2026.05 | 14.68 | 16.32 | 12.86 |