Recovering canonical instrumental variables on Military service → Earning
[Chart: EM and CM scores over time. Top EM: 74 (IV Co-Scientist, Feb 8, 2026).]
Updated 4d ago
Evaluation Results
| Method | Backbone | Date | EM | CM |
|---|---|---|---|---|
| IV Co-Scientist | GPT-4o | 2026.02 | 74 | 100 |
| IV Co-Scientist | QwQ | 2026.02 | 74 | 100 |
| IV Co-Scientist | o3-mini | 2026.02 | 73 | 100 |
| IV Co-Scientist | Llama3.1 70B | 2026.02 | 61 | 84 |
| IV Co-Scientist | Llama3.1 8B | 2026.02 | 28 | 42 |