Web agent task completion on GoBrowse
[Chart: Success Rate / Tokens Used. Top entry: CATTS (∆, best τ), Success Rate 90.4, Feb 12, 2026]
Updated 1mo ago
Evaluation Results
Method               Settings                    Date      Success Rate   Tokens Used
CATTS (∆, best τ)    Gating=Margin, N=10         2026.02   90.4           372,000
CATTS (H, best τ)    Gating=Entropy, N=10,...    2026.02   90.2           422,000
Always-arbitrate     N=10                        2026.02   88.3           443,000
Majority vote        N=10                        2026.02   88             481,000