Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

DeepResearchGym

Benchmarks

Task NameDataset NameSOTA ResultTrend
Deep ResearchDeepResearchGym (test)
Focus Stability80.67
10
Open-Ended Deep ResearchDeepResearchGym
Clarity90.71
9
Deep Research Report GenerationDeepResearchGym Commercial 100
KPR80.55
9
Showing 3 of 3 rows