
CREAK

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| False Premise Detection | CREAK | TP (%): 92.8 | 12 |
| Fact Verification | Creak | Accuracy: 0.956 | 8 |
| Commonsense Fact Verification | CREAK (test) | Accuracy: 88.6 | 5 |
| Commonsense Fact Verification | CREAK Contrast Set (contra) | Accuracy: 92.2 | 4 |