DeepResearch

Benchmarks

Task Name                 | Dataset Name                     | SOTA Result               | Trend
Deep Research             | DeepResearch Bench               | RACE Overall: 53.08       | 22
Judge Agreement Accuracy  | DeepResearch 1319 queries (test) | Agreement Accuracy: 74.5  | 19
Long-form deep research   | DeepResearch Bench (test)        | Overall Score: 48.24      | 13