Autonomous Exploration and Discovery on Bottom-up setting
[Chart: Numeric Grounding; Nomad, 70.4, Mar 31, 2026]
Updated 18d ago
Evaluation Results
| Method | # Reports | Date | Numeric Grounding | Factuality | Quality (Overall) | Quality (Analytical) | Quality (Coverage) | Quality (Actionability) | Quality (Presentation) | Intra-report Distinctness | Inter-report Diversity |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Nomad | 50 | 2026.03 | 70.4 | 65.2 | 61 | 59.46 | 68.07 | 49.15 | 84.54 | 43.73 | 0.4273 |
| o3-deep-research | 44 | 2026.03 | 34.9 | 40.2 | 54.2 | 52.82 | 80.6 | 30.63 | 86.61 | 72.9 | 0.2084 |
| GPTResearcher | 57 | 2026.03 | 17.3 | 28.5 | 42.5 | 46.04 | 57.87 | 34.85 | 83.54 | 66.7 | 0.0882 |