Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

QA-FEEDBACK

Benchmarks

Task NameDataset NameSOTA ResultTrend
Machine Reading ComprehensionQA-Feedback (test)
Relevance44.9
22
Question Answering FeedbackQA-Feedback
AUC ROC0.651
5
Question Answering Feedback GenerationQA-FEEDBACK (test)
Rs1 (Relevance)0.513
4
Information CompletenessQA-FEEDBACK (test)
Win Rate23
3
Showing 4 of 4 rows