Scientific Discovery on SPO
[Chart: SQ (%). Top point: PIEVO, 37.85 (Feb 6, 2026). Updated 4d ago.]
Evaluation Results
Method               Configuration              Date     SQ (%)
PIEVO                Backbone=Gemini-2.5-Fl...  2026.02  37.85
PIEVO                Backbone=Qwen3-32B, Th...  2026.02  37.33
PiFlow               Backbone=Gemini-2.5-Fl...  2026.02  37.31
AI Researcher        Backbone=Gemini-2.5-Fl...  2026.02  36.95
The AI Scientist v2  Backbone=Gemini-2.5-Fl...  2026.02  36.74
The AI Scientist v1  Backbone=Gemini-2.5-Fl...  2026.02  36.57
PiFlow               Backbone=Qwen3-32B, Th...  2026.02  33.99
Vanilla MAS          Backbone=Gemini-2.5-Fl...  2026.02  32.57
ReAct                Backbone=Gemini-2.5-Fl...  2026.02  32.57
The AI Scientist v1  Backbone=Qwen3-32B, Th...  2026.02  30.52
ReAct                Backbone=Qwen3-32B, Th...  2026.02  30.51
AI Researcher        Backbone=Qwen3-32B, Th...  2026.02  30.24
The AI Scientist v2  Backbone=Qwen3-32B, Th...  2026.02  30.01
Vanilla MAS          Backbone=Qwen3-32B, Th...  2026.02  28.71