Counterfactual Reasoning on CRAFT Hard Split (test)

Accuracy: 83.64

CRCG_GPT4

[Chart: Accuracy (y-axis ticks 51.2856, 59.6853, 68.085, 76.4847) by date (x-axis tick Jun 12, 2025)]
Updated 4d ago

Evaluation Results

Date      Accuracy
2025.06   83.64
2025.06   81.22
2025.06   81.2
2025.06   79.34
2025.06   68.48
2025.06   65.89
2025.06   56
2025.06   52.53