Deep Research on ScholarQA CS
[Chart: Average Score by date. Top entry: GEPA custom, Average Score 70.5, Apr 3, 2026]
Evaluation Results

Method                    Parameters                  Date      Average Score
GEPA custom               Prompt Initialization=...   2026.04   70.5
GEPA custom               Prompt Initialization=...   2026.04   70.1
GEPA                      Prompt Initialization=...   2026.04   68.5
TextGrad                  Prompt Initialization=...   2026.04   67.2
GEPA                      Prompt Initialization=...   2026.04   67
Expert Prompt Baseline    Prompt Initialization=...   2026.04   66.7
OpenAI                    Prompt Initialization=...   2026.04   66.7
TextGrad                  Prompt Initialization=...   2026.04   65.4
OpenAI                    Prompt Initialization=...   2026.04   58.3
Minimal Prompt Baseline   Prompt Initialization=...   2026.04   51.3