
Science & QA Domain

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Scientific Question Answering | Science & QA Domain Out-of-Domain | SampleQA Score: 3.19 | 11
Scientific Reasoning & QA | Science & QA Domain Multiple Datasets | Average Accuracy: 4.04 | 7
Showing 2 of 2 rows