
EXPERTQA

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
--- | --- | --- | ---
Fact-checking | ExpertQA | Balanced Accuracy 60.3 | 15
Attributable Text Generation | ExpertQA v1 (test) | AutoAIS 0.6612 | 9
Question Answering | EXPERTQA (test) | Claim Recall 19.27 | 6
Retrieval-Augmented Generation | ExpertQA | Faithfulness 73.9 | 5
Medical Long-form Answering | ExpertQA Biomed | Relevance 3.7 | 4