Causal Reasoning on Cladder AceReason (Complete)
[Chart: Accuracy over time, May 21, 2026. Top result: Model-first Greedy, 81.2.]
Updated 8d ago
Evaluation Results
Method                    Configuration               Date      Accuracy
Model-first Greedy        k=5, Summarizer=AceReason   2026.05   81.2
Input-all                 k=5, Summarizer=AceReason   2026.05   80.1
Best-model                k=5, Summarizer=AceReason   2026.05   78.6
Top-accuracy              k=5, Summarizer=AceReason   2026.05   77.7
Truth-prediction Greedy   k=5, Summarizer=AceReason   2026.05   76.2
GPT5.2-judge              k=5, Summarizer=AceReason   2026.05   76
MoA                       k=5, Summarizer=AceReason   2026.05   75.7
Oracle-surrogate Greedy   k=5, Summarizer=AceReason   2026.05   75.2
Conditioned-diversity     k=5, Summarizer=AceReason   2026.05   74.2
Aya-judge                 k=5, Summarizer=AceReason   2026.05   72