TweetEval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Detection | TweetEval hate | Macro F1: 67.53 | 21 |
| Tweet Classification | TweetEval 1.0 (test) | Emoji (M-F1): 34.2 | 18 |
| Emotion Classification | TweetEval Emotion | Macro-F1: 68.74 | 15 |
| Twitter Text Classification | TweetEval latest (test) | Emoji: 0.297 | 9 |
| Text Classification | TweetEval Offensive (test) | Accuracy: 69.14 | 8 |
| Text Classification | TweetEval Hate (test) | Accuracy: 55.72 | 8 |
| Pruning | TweetEval T-Sentiment (test) | AU-MSE: 1.23 | 8 |
| Pruning | TweetEval T-Hate (test) | AU-MSE: 4.88 | 8 |
| Pruning | TweetEval T-Emotions (test) | AU-MSE: 1.47 | 8 |
| Irony Detection | TweetEval irony (test) | Accuracy: 84.18 | 7 |
| Detection | TweetEval offensive | Macro F1: 68.3 | 6 |
| Detection | TweetEval irony | Macro F1: 62.7 | 6 |
| Detection | TweetEval stance-feminist (test) | Macro F1: 41.3 | 6 |
| Safety Evaluation | TweetEval | F1: 72 | 3 |
| Detection | TweetEval stance-atheism | Macro F1: 27.4 | 3 |
| Detection | TweetEval stance-atheism (TW-A) (test) | Macro-F1: 0.285 | 3 |
| Detection | TweetEval-offensive (Tw-O) (test) | Macro F1: 56.3 | 3 |
| Detection | TweetEval-hate (Tw-H) (test) | Macro F1: 59 | 3 |