| Dataset Name | SOTA Method | Metric | Trend | Updated |
|---|---|---|---|---|
| BrowseComp | | Accuracy 73.33 | 52 | 4d ago |
| BrowseComp-zh | DS V3.2-Thinking | Accuracy 65 | 21 | 13d ago |
| BrowseComp+ (test) | | Accuracy 56.4 | 20 | 12d ago |
| BrowseComp (official) | Tendem’s AI agent | Exact Match 71 | 5 | 1mo ago |
| BrowseComp-Plus | | Pass 72 | 4 | 4d ago |
| WebArena | R2D2 | Accuracy 27.3 | 3 | 1mo ago |