WG

Benchmarks

Task Name                Dataset Name   SOTA Result        Trend
Common Sense Reasoning   WG             Accuracy: 94.1     38
Commonsense Reasoning    WG-S           Accuracy: 70.9     18
Harmful Refusal          WG (test)      ASR: 11.6          16
EEG Classification       WG             Accuracy: 0.7321   6