Causal Reasoning on CLadder
[Chart: Exact Match by model. Highlighted point: Llama3.1-8B-Instruct, Exact Match 88, Jan 29, 2026. Selectable metrics: Exact Match, LLM Score, DOVERIFIER Score.]
Evaluation Results

Method                 Details                     Date      Exact Match   LLM Score   DOVERIFIER Score
Llama3.1-8B-Instruct   Parameters=8B, Backbon...   2026.01   88            66          90
Gemma-7B-it            Parameters=7B, Backbon...   2026.01   80            58          84
Mistral-7B             Parameters=7B, Backbon...   2026.01   58            80          94
Llama3.1-8B            Parameters=8B, Backbon...   2026.01   57            60          73