Long-form QA on MedicationQA (test)
[Chart: ROUGE-1 by date. Best result: 27.73 (Fine-Tuned GPT-4o + MedBioRAG), Dec 10, 2025. Selectable metrics: ROUGE-1, ROUGE-2, ROUGE-L, BLEU, BERTScore, BLEURT.]
Updated 4d ago
Evaluation Results
Method                         Method (settings)          Links    ROUGE-1  ROUGE-2  ROUGE-L  BLEU  BERTScore  BLEURT
Fine-Tuned GPT-4o + MedBioRAG  Fine-tuned=true, MedBi...  2025.12  27.73    15.09    22.72    7.24  8.79       -33.63
Fine-Tuned GPT-4o              Fine-tuned=true, MedBi...  2025.12  24.69    8.8      17.61    2.49  8.98       -33.82
GPT-4o                         Fine-tuned=false, MedB...  2025.12  22.92    13.69    18.7     7.89  8.55       -6.92
GPT-4o + MedBioRAG             Fine-tuned=false, MedB...  2025.12  19.85    4.2      10.97    0.98  -7.63      -33.21