Tool Use on BrowseComp Domain-specific (9)
Chart: Accuracy and Task Completion Count. Best Accuracy: 22.5 (Gold Oracle), Feb 16, 2026.
Updated 4d ago
Evaluation Results
| Method | Details | Date | Accuracy | Task Completion Count |
|---|---|---|---|---|
| Gold Oracle | Model=GPT-5, Strategy=... | 2026.02 | 22.5 | 7 |
| TOOLOBSERVER | Model=GPT-5, Strategy=... | 2026.02 | 21.9 | 6.3 |
| EasyTool | Model=GPT-5, Strategy=... | 2026.02 | 18.9 | 8.2 |
| Base ReAct | Model=GPT-5, Strategy=... | 2026.02 | 18.1 | 10 |
| Play2Prompt | Model=GPT-5, Strategy=... | 2026.02 | 17.9 | 7.8 |
| Gold Oracle | Model=GPT-5-mini, Stra... | 2026.02 | 9.7 | 2.8 |
| EasyTool | Model=GPT-5-mini, Stra... | 2026.02 | 7.1 | 3.8 |
| TOOLOBSERVER | Model=GPT-5-mini, Stra... | 2026.02 | 3.2 | 3.5 |
| Base ReAct | Model=GPT-5-mini, Stra... | 2026.02 | 3.1 | 4.1 |
| Play2Prompt | Model=GPT-5-mini, Stra... | 2026.02 | 2.8 | 3.5 |