
Scientific Reasoning

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Scientific Reasoning | Scientific Reasoning Subset A | ROUGE-L 14.7 | 8
Scientific Reasoning | Scientific Reasoning Domain Average | Accuracy 65.2 | 4