Agentic Orchestration on GAIA
[Chart: Accuracy and Average Cost views. Highlighted point: AORCHESTRA, Accuracy 80, Feb 3, 2026. Updated 4d ago.]
Evaluation Results
| Method | Configuration | Date | Accuracy | Average Cost |
|---|---|---|---|---|
| AORCHESTRA | LM=Gemini-3-Flash | 2026.02 | 80 | 0.79 |
| AORCHESTRA (ICL) | Optimization=In-contex... | 2026.02 | 75.15 | 0.57 |
| AORCHESTRA (Gemini-3-Flash Orchestrator) | LM Pool=Mixed | 2026.02 | 72.12 | 0.7 |
| AORCHESTRA | LM=Claude-4.5-sonnet | 2026.02 | 71.52 | 0.91 |
| AORCHESTRA (Qwen3-8B SFT Orchestrator) | Training=Supervised Fi... | 2026.02 | 68.48 | 0.68 |
| AORCHESTRA | LM=Deepseek-v3.2 | 2026.02 | 67.87 | 0.14 |
| AORCHESTRA | LM=GPT-5-mini | 2026.02 | 67.27 | 0.28 |
| AORCHESTRA (Qwen3-8B Orchestrator) | LM Pool=Gemini-3-Flash | 2026.02 | 56.97 | 0.36 |
| ReAct | LM=GPT-5-mini | 2026.02 | 54.55 | 0.052 |
| ReAct | LM=Claude-4.5-sonnet | 2026.02 | 53.93 | 0.19 |
| ReAct | LM=Gemini-3-Flash | 2026.02 | 49.09 | 0.07 |
| ReAct | LM=Claude-4.5-haiku | 2026.02 | 47.88 | 0.066 |
| ReAct | LM=Deepseek-v3.2 | 2026.02 | 46.7 | 0.027 |
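
The results above list AORCHESTRA and ReAct with the same LM in several rows, so a like-for-like comparison is simple arithmetic. The sketch below is illustrative only: it copies the rows that carry an explicit `LM=` configuration and, for each LM that appears under both methods, reports the accuracy difference and the cost ratio in the table's own units. The names `ROWS` and `paired_deltas` are hypothetical and not part of the leaderboard.

```python
# Rows copied from the Evaluation Results listing: (method, LM, accuracy, average cost).
# Only entries with an explicit "LM=" configuration are included; orchestrator and
# ICL variants are left out because they have no single-LM ReAct counterpart.
ROWS = [
    ("AORCHESTRA", "Gemini-3-Flash", 80.0, 0.79),
    ("AORCHESTRA", "Claude-4.5-sonnet", 71.52, 0.91),
    ("AORCHESTRA", "Deepseek-v3.2", 67.87, 0.14),
    ("AORCHESTRA", "GPT-5-mini", 67.27, 0.28),
    ("ReAct", "GPT-5-mini", 54.55, 0.052),
    ("ReAct", "Claude-4.5-sonnet", 53.93, 0.19),
    ("ReAct", "Gemini-3-Flash", 49.09, 0.07),
    ("ReAct", "Claude-4.5-haiku", 47.88, 0.066),
    ("ReAct", "Deepseek-v3.2", 46.7, 0.027),
]


def paired_deltas(rows):
    """Map each LM listed under both methods to (accuracy gain, cost ratio).

    Accuracy gain is AORCHESTRA minus ReAct; cost ratio is AORCHESTRA
    average cost divided by ReAct average cost. Both are rounded to 2 places.
    """
    by_key = {(method, lm): (acc, cost) for method, lm, acc, cost in rows}
    result = {}
    for (method, lm), (acc, cost) in by_key.items():
        if method != "AORCHESTRA":
            continue
        baseline = by_key.get(("ReAct", lm))
        if baseline is None:
            continue  # no ReAct row for this LM
        base_acc, base_cost = baseline
        result[lm] = (round(acc - base_acc, 2), round(cost / base_cost, 2))
    return result


if __name__ == "__main__":
    for lm, (gain, ratio) in paired_deltas(ROWS).items():
        print(f"{lm}: accuracy +{gain}, cost x{ratio}")
```

Under these assumptions, every LM with both entries shows higher accuracy and higher average cost for AORCHESTRA than for ReAct; Claude-4.5-haiku is skipped because it appears only under ReAct.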