Medical Long-form Answering on MedicationQA
[Chart: Relevance score by date. Top entry: SERTS, 3.9 (Jan 5, 2025). Available metrics: Relevance, Completeness, Proficiency, Interpretation.]
Updated 1mo ago
Evaluation Results
| Method | Setting | Date | Relevance | Completeness | Proficiency | Interpretation |
|---|---|---|---|---|---|---|
| SERTS | Reader=Frozen Qwen2.5-7B | 2025.01 | 3.9 | 3.8 | 3.6 | 3 |
| No Retrieval | Reader=Frozen Qwen2.5-7B | 2025.01 | 3.7 | 3.6 | 3.6 | - |
| SPO Planning | Reader=Frozen Qwen2.5-7B | 2025.01 | 3.7 | 4 | 4 | 4.1 |
| Original Question | Reader=Frozen Qwen2.5-7B | 2025.01 | 3.5 | 3.5 | 3.3 | 3.2 |