Causal Estimation on Synthetic
[Chart: MSA by date. Highlighted result: Claude Code Raw, MSA 100, Apr 2, 2026.]
Updated 16d ago
Evaluation Results
Method               Details                    Date     MSA    MRE
Claude Code Raw      Time (s)=34, Cost ($)=...  2026.04  100    10.6
CAIS Skill (Full)    Time (s)=59, Cost ($)=...  2026.04  100    4.1
MAS Compiler         Time (s)=115, Cost ($)...  2026.04  100    12.9
Structured Pipeline  Time (s)=56, Cost ($)=...  2026.04  100    5.5
Adaptive Skill       Time (s)=43, Cost ($)=...  2026.04  100    6.3
Auto Opt.            Time (s)=71, Cost ($)=...  2026.04  100    6
Tools Only           Time (s)=43, Cost ($)=...  2026.04  95.6   5.5
Knowledge Only       Time (s)=55, Cost ($)=...  2026.04  91.1   4.3
CAIS (MAS)           Time (s)=123, Cost ($)...  2026.04  75.9   16.2