Web Navigation Question Answering on WebWalker QA (avg@3)
[Chart: Avg@3 over time. Best result: 72.2 (Tongyi-DeepResearch-30B), Nov 14, 2025]
Updated 1mo ago
Evaluation Results
Method                    Tags                        Date      Avg@3
Tongyi-DeepResearch-30B   Type=Research Agents        2025.11   72.2
OpenAI-o3                 Type=Foundation Models...   2025.11   71.7
AFM-32B-RL                Type=Research Agents        2025.11   63
WebExplorer-8B-RL         Type=Research Agents        2025.11   62.7
MiroThinker-v1.0-72B      Parameters=72B              2025.11   62.1
Claude-4-Sonnet           Type=Foundation Models...   2025.11   61.7
DeepSeek-V3.1             Type=Foundation Models...   2025.11   61.2
MiroThinker-v1.0-30B      Parameters=30B              2025.11   61
MiroThinker-v1.0-8B       Parameters=8B               2025.11   60.6