
CRASS

Benchmarks

Task Name                       Dataset Name  SOTA Result                     Trend
Counterfactual Reasoning        CRASS         Exact Match Performance: 94.53  11
Causal Variable Identification  CRASS         F1 (X): 92.3                    7
Outcome Reasoning               CRASS         M' (F1 Mean): 92.1              7
Logic-heavy Reasoning           CRASS         Score: 56.22                    6