Agentic Search on BrowseComp-Plus
[Chart: Accuracy over time. Best result: 66.5 (Solution Aggregation), May 30, 2026.]
Evaluation Results
Method                Configuration              Date     Accuracy
Solution Aggregation  Model=Gemini-3-flash,...   2026.05  66.5
FINEVERIFY            Model=Gemini-3-flash,...   2026.05  66.5
Weighted Voting       Model=Gemini-3-flash,...   2026.05  64
Best-of-N             Model=Gemini-3-flash,...   2026.05  63.5
Confidence Verify     Model=Gemini-3-flash,...   2026.05  63
Majority Voting       Model=Gemini-3-flash,...   2026.05  62.5
FINEVERIFY            Model=GPT-5-mini, Samp...  2026.05  60.5
Pass@1                Model=Gemini-3-flash,...   2026.05  60.5
Best-of-N             Model=GPT-5-mini, Samp...  2026.05  59.5
Solution Aggregation  Model=GPT-5-mini, Samp...  2026.05  59.5
Confidence Verify     Model=GPT-5-mini, Samp...  2026.05  58
Weighted Voting       Model=GPT-5-mini, Samp...  2026.05  57.5
Pass@1                Model=GPT-5-mini, Samp...  2026.05  49.5
Majority Voting       Model=GPT-5-mini, Samp...  2026.05  49.5