General AI Assistant Reasoning on GAIA text
Top result: Reagent-U, 43.7 Pass@1
[Chart: Pass@1 and Pass@3 scores; date shown: Jan 29, 2026]
Updated 4d ago
Evaluation Results
Method                       Date      Pass@1   Pass@3
Reagent-U                    2026.01   43.7     53.4
MCP-R1                       2026.01   39.8     52.4
Qwen3-8B (model size: 8B)    2026.01   21.4     24.3