
SciBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Scientific Problem Solving | SciBench-107 | Atkins Score 62.5 | 24 |
| Uncertainty Calibration | SciBench | AUROC 78.7 | 18 |
| Scientific Problem Solving | SciBench | Pass@20 77.46 | 17 |
| Scientific Reasoning | SciBench | Score 28.52 | 17 |
| Scientific Reasoning | SciBench | Pass@10 72.23 | 6 |