Deep Research Report Generation on Top-down setting 1.0 (test)
[Chart: Numeric Grounding by method over time. Highlighted point: Nomad, 66.3, Mar 31, 2026.]
Updated 18d ago
Evaluation Results
| Method | # Reports | Date | Numeric Grounding | Factuality | Quality (Overall) | Quality (Analytical) | Quality (Coverage) | Quality (Actionability) | Quality (Presentation) | Intra-report Distinctness | Inter-report Diversity |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Nomad | 18 | 2026.03 | 66.3 | 65.9 | 62.6 | 58.36 | 72.76 | 43.83 | 85.18 | 52.64 | 0.4697 |
| o3-deep-research | 18 | 2026.03 | 64.1 | 73.3 | 54.9 | 58.11 | 81.83 | 28.44 | 87.5 | 75.1 | 0.1599 |
| GPTResearcher | 18 | 2026.03 | 46.2 | 52.2 | 47.3 | 45.15 | 58.88 | 31.96 | 87.36 | 72.7 | 0.0836 |