Medical Report Generation on Med-R2 (test)
[Chart: ROUGE-1 by method; top entry Hulu-Med-7B at 14.6, dated May 23, 2026. Metrics available: ROUGE-1, BERTScore.]
Updated 8d ago
Evaluation Results
Method                 Model Category   Date      ROUGE-1   BERTScore
Hulu-Med-7B            Medical...       2026.05   14.6      83.6
Hulu-Med-14B           Medical...       2026.05   14.5      86.3
Hulu-Med-4B            Medical...       2026.05   13.7      83.3
Linshu-7B              Medical...       2026.05   12.7      82
HuatuoGPT-Vision-34B   Medical...       2026.05   12.5      83.1
Medgemma-4B            Medical...       2026.05   10.9      81.1
Janus-Pro-7B           Open So...       2026.05   10.3      81.9
Intern3-VL3-8B         Open So...       2026.05   10.3      82.6
GPT-5.2-thinking       Proprie...       2026.05   9.9       81.9
Qwen3-VL-8B            Open So...       2026.05   8.6       81.6
GPT-4o                 Proprie...       2026.05   8.6       82.3
HuatuoGPT-Vision-7B    Medical...       2026.05   8.5       81.5
Qwen3-VL-235b          Proprie...       2026.05   7.9       81.2
Qwen2.5-VL-7B          Open So...       2026.05   7.8       80.1
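
The two metrics do not order the methods identically, so it can help to rank the results by each one separately. The following is a minimal sketch, not part of the benchmark itself: the scores are copied from the results listed above, while `RESULTS` and `rank_by` are hypothetical names introduced only for illustration.

```python
# Scores copied from the evaluation results: (method, ROUGE-1, BERTScore).
# RESULTS and rank_by are illustrative names, not part of any benchmark tooling.
RESULTS = [
    ("Hulu-Med-7B", 14.6, 83.6),
    ("Hulu-Med-14B", 14.5, 86.3),
    ("Hulu-Med-4B", 13.7, 83.3),
    ("Linshu-7B", 12.7, 82.0),
    ("HuatuoGPT-Vision-34B", 12.5, 83.1),
    ("Medgemma-4B", 10.9, 81.1),
    ("Janus-Pro-7B", 10.3, 81.9),
    ("Intern3-VL3-8B", 10.3, 82.6),
    ("GPT-5.2-thinking", 9.9, 81.9),
    ("Qwen3-VL-8B", 8.6, 81.6),
    ("GPT-4o", 8.6, 82.3),
    ("HuatuoGPT-Vision-7B", 8.5, 81.5),
    ("Qwen3-VL-235b", 7.9, 81.2),
    ("Qwen2.5-VL-7B", 7.8, 80.1),
]


def rank_by(results, metric_index):
    """Return method names sorted by the chosen metric, best first.

    metric_index is 1 for ROUGE-1 and 2 for BERTScore. Ties keep the
    order in which the rows are listed, because Python's sort is stable.
    """
    ordered = sorted(results, key=lambda row: row[metric_index], reverse=True)
    return [row[0] for row in ordered]


by_rouge = rank_by(RESULTS, 1)
by_bert = rank_by(RESULTS, 2)

print("Top 3 by ROUGE-1:  ", by_rouge[:3])
print("Top 3 by BERTScore:", by_bert[:3])
```

On these numbers, Hulu-Med-7B leads on ROUGE-1 while Hulu-Med-14B leads on BERTScore, so the choice of metric changes which method ranks first.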