
DetectiveQA

Benchmarks

Task Name                          Dataset Name      SOTA Result          Trend
Long-context Question Answering    DetectiveQA-En    Accuracy: 75.5       32
Long-context Question Answering    DetectiveQA-Zh    Accuracy: 0.8417     32
Retrieval                          DetectiveQA       Recall@3: 32.22      8
Retrieval                          DetectiveQA-ZH    R@3: 46.8            6
Question Answering                 DetectiveQA       Accuracy: 67.25      6