Medical Long-form Answering on ExpertQA Biomed
[Chart: Relevance score by date. Original Question, 3.7 (Jan 5, 2025). Available metrics: Relevance, Completeness, Proficiency, Interpretation.]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Relevance | Completeness | Proficiency | Interpretation |
|---|---|---|---|---|---|---|
| Original Question | Reader=Frozen Qwen2.5-7B | 2025.01 | 3.7 | 3.6 | 3.6 | 3.7 |
| No Retrieval | Reader=Frozen Qwen2.5-7B | 2025.01 | 3.6 | 3.4 | 2.7 | - |
| SERTS | Reader=Frozen Qwen2.5-7B | 2025.01 | 3.6 | 3.3 | 2.7 | 3.1 |
| SPO Planning | Reader=Frozen Qwen2.5-7B | 2025.01 | 3.6 | 4.3 | 4.1 | 4 |