Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Research Task on DeepResearchBench (DRB)

95.9Accuracy (%)

GPT 5.4

-3.83622.05747.9573.843May 16, 2026
Updated 15d ago

Evaluation Results

MethodLinks
2026.05
95.9-
2026.05
92.516.7
2026.05
90.1-
2026.05
85.2-
2026.05
80.5-
2026.05
80.5-
2026.05
79.4-
2026.05
77.6-
2026.05
75.8-
2026.05
75.8-
2026.05
72.1-
2026.05
63.113.1
2026.05
63-
2026.05
50-
2026.05
42-
2026.05
38.215.6
2026.05
22.6-
2026.05
22.6-
2026.05
15.615.6
2026.05
10.5-
2026.05
0-