
MultifieldQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Question Answering | MultifieldQA | F1 Score: 57.2 | 52 |
| Question Answering | MultiFieldQA en | EM: 24.67 | 38 |
| Question Answering | MultiFieldQA-zh | F1 Score: 35.25 | 30 |
| Long Context Question Answering | MultiFieldQA | Accuracy: 57.33 | 15 |
| Speculative Decoding | MultiFieldQA | Speculative Rate (SR): 2.1 | 12 |
| Long-context Question Answering | MultifieldQA | C Score: 88.6 | 9 |
| Long-context answering with citations | MultifieldQA | Citation Recall: 79 | 9 |
| Question Answering | MultiFieldQA | Rel. Perf vs Truncated ICL: 1.041 | 5 |