BenchMarker

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Writing flaw detection | BenchMarker Writing - Out of Domain, Human | Accuracy 92.6 | 27 |
| Writing flaw detection | BenchMarker Writing - In Domain, NLP | Accuracy 81.5 | 27 |
| Shortcut detection | BenchMarker Shortcuts | Accuracy 81.6 | 26 |
| Contamination detection | BenchMarker | Accuracy 71.2 | 11 |