Counterfactual reasoning

Benchmarks

Dataset Name	SOTA Method	Metric	Value	Results	Updated
Counterfactual Eval (dev)	PE2	Mean Score	63.4	52	4mo ago
CVQA		Accuracy	71.41	40	4mo ago
MMLU-CF	GHS-TDA	EM	71.6	30	4mo ago
CounterBench		Basic Score	80.8	20	3mo ago
UCI Adult missing values Agent (test)	SHAP	Accuracy	100	16	4mo ago
Counterfactual reasoning Agent synthetic (test)	LIME	Accuracy	99.7	16	4mo ago
CRASS	GPT-4	Exact Match Performance	94.53	11	4mo ago
CRAFT Hard Split (test)	CRCG_GPT4	Accuracy	83.64	8	4mo ago
CRAFT Easy Split (test)	BERT-D	Accuracy	80.05	8	4mo ago
MCQA (test)	PLOT-pca-DAS	AP (Exact Counterfactual Accuracy)	90.75	6	2mo ago
Y-struct NADD	JANUS	MSE	16,515	5	4mo ago
Diamond NADD	JANUS	MSE	599	5	4mo ago
Triangle NADD	JANUS	MSE	150	5	4mo ago
Chain NADD	JANUS	MSE	48.8	5	4mo ago
OmniDrive	Omni-L	Safe Precision	72.1	4	4mo ago
C-VQA	ViperGPT	Numerical Direct Accuracy	80.6	4	4mo ago
SCM battery 300 random models per query		Mean Error (PN)	0.18	3	1mo ago
Twin-EventLog Smallville context	substrate	Joint Accuracy (A ∧ B)	100	3	2mo ago
SCM battery		Infeasible Rate	28	1	1mo ago
UCI Adult missing values Human Survey (test)	-	-	-	0	4mo ago
Counterfactual reasoning Human Amazon Mechanical Turk (test)	-	-	-	0	4mo ago
