Scientific Discovery on NHO
[Chart: Solution Quality (SQ); highlighted point: PIEVO, 0.9636, Feb 6, 2026]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | Solution Quality (SQ) |
| --- | --- | --- | --- |
| PIEVO | Backbone=Qwen3-32B, Th... | 2026.02 | 0.9636 |
| PIEVO | Backbone=Gemini-2.5-Fl... | 2026.02 | 0.8798 |
| The AI Scientist v1 | Backbone=Qwen3-32B, Th... | 2026.02 | 0.7976 |
| PiFlow | Backbone=Qwen3-32B, Th... | 2026.02 | 0.7968 |
| PiFlow | Backbone=Gemini-2.5-Fl... | 2026.02 | 0.7932 |
| AI Researcher | Backbone=Qwen3-32B, Th... | 2026.02 | 0.7905 |
| The AI Scientist v2 | Backbone=Gemini-2.5-Fl... | 2026.02 | 0.7785 |
| The AI Scientist v1 | Backbone=Gemini-2.5-Fl... | 2026.02 | 0.774 |
| ReAct | Backbone=Gemini-2.5-Fl... | 2026.02 | 0.737 |
| AI Researcher | Backbone=Gemini-2.5-Fl... | 2026.02 | 0.7246 |
| The AI Scientist v2 | Backbone=Qwen3-32B, Th... | 2026.02 | 0.715 |
| ReAct | Backbone=Qwen3-32B, Th... | 2026.02 | 0.6702 |
| Vanilla MAS | Backbone=Gemini-2.5-Fl... | 2026.02 | 0.6377 |
| Vanilla MAS | Backbone=Qwen3-32B, Th... | 2026.02 | 0.5822 |