Instrumental Variable Discovery on Gapminder Sanitation → Mortality
[Chart: Relevance by date. Top result: IV Co-Scientist, Relevance 11.37 (Feb 8, 2026). Available metrics: Relevance, Cnorm.]
Updated 4d ago
Evaluation Results
Method           Backbone LLM   Date     Relevance  Cnorm
IV Co-Scientist  GPT-4o         2026.02  11.37      0.508
IV Co-Scientist  o3-mini        2026.02  11.37      0.508
IV Co-Scientist  QwQ            2026.02  10.76      0.515
IV Co-Scientist  Llama3.1 8b    2026.02  10.76      0.515
IV Co-Scientist  Llama3.1 70b   2026.02  10.28      0.581