Agentic tasks on BrowseComp
[Chart: Accuracy over time. Top result: MAS-ZERO, 9.45 (May 21, 2025).]
Updated 1mo ago
Evaluation Results
Method       Configuration              Date     Accuracy
MAS-ZERO     Backbone LLM=GPT-4o, S...  2025.05  9.45
CoT-SC       Backbone LLM=GPT-4o, S...  2025.05  8.66
Self-Refine  Backbone LLM=GPT-4o, S...  2025.05  5.51
CoT          Backbone LLM=GPT-4o, S...  2025.05  3.97
Debate       Backbone LLM=GPT-4o, S...  2025.05  3.94