
Science Domain

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Scientific Reasoning | Science Domain In-Domain: SampleQA, GPQA(ALL), HLE | SampleQA Score: 3.26 | 18 |
| Reasoning | Science Domain 20 tasks (test) | Total Cost (USD): 0.11 | 3 |
| Multi-agent task routing | Science Domain 1.0 (test) | Total Cost: 0.59 | 2 |