Scientific Discovery on Second Autocorrelation Inequality
[Chart: score over time. Highlighted point: CausalEvolve, Step 1 Mean Score 0.781, Mar 15, 2026. Selectable series: Mean Score and Best Score for Step 1 through Step 4.]
Updated 1mo ago
Evaluation Results
| Method | Backbone | Date | Step 1 Mean Score | Step 1 Best Score | Step 2 Mean Score | Step 2 Best Score | Step 3 Mean Score | Step 3 Best Score | Step 4 Mean Score | Step 4 Best Score |
|---|---|---|---|---|---|---|---|---|---|---|
| CausalEvolve | Grok-4.1-FR | 2026.03 | 0.781 | 0.8 | 0.783 | 0.805 | 0.79 | 0.809 | 0.793 | 0.809 |
| COAT | Grok-4.1-FR | 2026.03 | 0.753 | 0.77 | 0.771 | 0.783 | 0.773 | 0.783 | 0.783 | 0.786 |
| CausalPlanner (Meta) | Grok-4.1-FR | 2026.03 | 0.73 | 0.745 | 0.734 | 0.749 | 0.735 | 0.75 | 0.736 | 0.75 |
| ShinkaEvolve | Grok-4.1-FR | 2026.03 | 0.723 | 0.724 | 0.729 | 0.739 | 0.735 | 0.749 | 0.737 | 0.751 |