Medical Long-form Answering on LiveQA
[Chart: Relevance score over time. Plotted point: No Retrieval, 4.1, Jan 5, 2025. Selectable metrics: Relevance, Completeness, Proficiency, Interpretation.]
Updated 4d ago
Evaluation Results
Method             Configuration             Date     Relevance  Completeness  Proficiency  Interpretation
No Retrieval       Reader=Frozen Qwen2.5-7B  2025.01  4.1        3.3           2.8          -
Original Question  Reader=Frozen Qwen2.5-7B  2025.01  4.1        3.6           3.5          3
SPO Planning       Reader=Frozen Qwen2.5-7B  2025.01  4.1        4.3           4.2          3.8
SERTS              Reader=Frozen Qwen2.5-7B  2025.01  3.9        3.3           3.6          3.4