
Counterfactual reasoning

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Counterfactual reasoning | Counterfactual reasoning Agent synthetic (test) | Accuracy: 99.7 | 16
Counterfactual reasoning | Counterfactual reasoning Human Amazon Mechanical Turk (test) | Metric: - | 0