Scientific Discovery on Hadamard Matrix
[Chart: Step 1 Mean Score over time. Highlighted point: CausalPlanner (Meta), 55.6, Mar 15, 2026. Selectable metrics: Step 1 through Step 4, Mean Score and Best Score.]
Updated 1mo ago
Evaluation Results
| Method | Backbone | Date | Step 1 Mean Score | Step 1 Best Score | Step 2 Mean Score | Step 2 Best Score | Step 3 Mean Score | Step 3 Best Score | Step 4 Mean Score | Step 4 Best Score |
|---|---|---|---|---|---|---|---|---|---|---|
| CausalPlanner (Meta) | Grok-4.1-FR | 2026.03 | 55.6 | 57.3 | 56.7 | 57.3 | 56.7 | 57.3 | 56.7 | 57.3 |
| CausalEvolve | Grok-4.1-FR | 2026.03 | 54.2 | 57.4 | 55 | 57.4 | 56.3 | 57.6 | 56.8 | 57.6 |
| COAT | Grok-4.1-FR | 2026.03 | 50.3 | 51.9 | 51.4 | 54.3 | 52.1 | 55.2 | 53.2 | 56.1 |
| ShinkaEvolve | Grok-4.1-FR | 2026.03 | 49.5 | 53.3 | 52.1 | 54 | 52.1 | 54 | 52.1 | 54 |