MultifieldQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | MultifieldQA | F1 Score: 57.2 | 52 |
| Long Context Question Answering | MultiFieldQA | Accuracy: 57.33 | 15 |
| Speculative Decoding | MultiFieldQA | Speculative Rate (SR): 2.1 | 12 |
| Long-context Question Answering | MultifieldQA | C Score: 88.6 | 9 |
| Long-context answering with citations | MultifieldQA | Citation Recall: 79 | 9 |
| Question Answering | MultiFieldQA | Rel. Perf vs Truncated ICL: 1.041 | 5 |