HeadQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Multilingual Multiple-Choice Question Answering | HeadQA 1.0 (test) | Chinese Acc: 88.88 | 56 |
| Medical Question Answering | HeadQA | Accuracy: 92.2 | 30 |
| Question Answering | HeadQA English | Accuracy: 40.1 | 25 |
| Question Answering | HeadQA | Accuracy: 64.9 | 14 |
| Reasoning | HeadQA | Pass@1: 77.9 | 10 |
| Question Answering | HeadQA | Pass@1: 78.12 | 8 |
| Question Answering | HeadQA | Accuracy: 51.1 | 2 |