
Human Evaluation on DeepResearch Bench (20 sampled reports)

Readability (Win/Tie Rate): 95

PTAH

May 28, 2026
Updated 5d ago

Evaluation Results

Method    Links
2026.05   95      90      95      95
2026.05   90      95      100     100
2026.05   88.75   88.75   96.25   95
2026.05   85      90      95      95
2026.05   85      80      95      90