
OfficeQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Office Application Question Answering | OfficeQA held-out (test) | Score (%): 72.1 | 59 |
| Question Answering | OfficeQA | Accuracy: 82.5 | 25 |
| Question Answering | OfficeQA 246 questions | Accuracy: 80.1 | 15 |
| Long-context reasoning | OfficeQA | Accuracy: 57.14 | 10 |