Causal Estimation on QRData
[Chart: leaderboard plot. Top entry: CAIS Skill (Full), MSA 89.7, Apr 2, 2026. Metrics shown: MSA, MRE.]
Evaluation Results
Method               Details                     Date     MSA   MRE
CAIS Skill (Full)    Time (s)=59, Cost ($)=...   2026.04  89.7  29.7
Adaptive Skill       Time (s)=43, Cost ($)=...   2026.04  89.7  29.5
Knowledge Only       Time (s)=55, Cost ($)=...   2026.04  87.2  23.1
Auto Opt.            Time (s)=71, Cost ($)=...   2026.04  87.2  25.6
CAIS (MAS)           Time (s)=123, Cost ($)...   2026.04  83.3  54
Tools Only           Time (s)=43, Cost ($)=...   2026.04  69.2  28.3
Structured Pipeline  Time (s)=56, Cost ($)=...   2026.04  69.2  26.5
MAS Compiler         Time (s)=115, Cost ($)...   2026.04  64.1  26.3
Claude Code Raw      Time (s)=34, Cost ($)=...   2026.04  61.5  31.2