TweetEval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Tweet Classification | TweetEval 1.0 (test) | Emoji (M-F1): 34.2 | 18 |
| Twitter Text Classification | TweetEval latest (test) | Emoji: 0.297 | 9 |
| Text Classification | TweetEval Offensive (test) | Accuracy: 69.14 | 8 |
| Text Classification | TweetEval Hate (test) | Accuracy: 55.72 | 8 |
| Pruning | TweetEval T-Sentiment (test) | AU-MSE: 1.23 | 8 |
| Pruning | TweetEval T-Hate (test) | AU-MSE: 4.88 | 8 |
| Pruning | TweetEval T-Emotions (test) | AU-MSE: 1.47 | 8 |
| Irony Detection | TweetEval irony (test) | Accuracy: 84.18 | 7 |
| Detection | TweetEval offensive | Macro F1: 68.3 | 6 |
| Detection | TweetEval irony | Macro F1: 62.7 | 6 |
| Detection | TweetEval hate | Macro F1: 61.2 | 6 |
| Detection | TweetEval stance-feminist (test) | Macro F1: 41.3 | 6 |
| Safety Evaluation | TweetEval | F1: 72 | 3 |
| Detection | TweetEval stance-atheism | Macro F1: 27.4 | 3 |
| Detection | TweetEval stance-atheism (TW-A) (test) | Macro-F1: 0.285 | 3 |
| Detection | TweetEval-offensive (Tw-O) (test) | Macro F1: 56.3 | 3 |
| Detection | TweetEval-hate (Tw-H) (test) | Macro F1: 59 | 3 |