TweetEval

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Text Classification | TweetEval | Accuracy 72.69 | 112 |
| Text Classification | TweetEVAL (test) | Accuracy (A) 84.17 | 44 |
| Detection | TweetEval hate | Macro F1 67.53 | 21 |
| Tweet Classification | TweetEval 1.0 (test) | Emoji (M-F1) 34.2 | 18 |
| Emotion Classification | TweetEval Emotion | Macro-F1 68.74 | 15 |
| Twitter Text Classification | TweetEval latest (test) | Emoji 0.297 | 9 |
| Sentiment Prediction | TweetEval (IID) | Accuracy 53.1 | 8 |
| Text Classification | TweetEval Offensive (test) | Accuracy 69.14 | 8 |
| Text Classification | TweetEval Hate (test) | Accuracy 55.72 | 8 |
| Pruning | TweetEval T-Sentiment (test) | AU-MSE 1.23 | 8 |
| Pruning | TweetEval T-Hate (test) | AU-MSE 4.88 | 8 |
| Pruning | TweetEval T-Emotions (test) | AU-MSE 1.47 | 8 |
| Irony Detection | TweetEval irony (test) | Accuracy 84.18 | 7 |
| Detection | TweetEval offensive | Macro F1 68.3 | 6 |
| Detection | TweetEval irony | Macro F1 62.7 | 6 |
| Detection | TweetEval stance-feminist (test) | Macro F1 41.3 | 6 |
| Safety Evaluation | TweetEval | F1 72 | 3 |
| Detection | TweetEval stance-atheism | Macro F1 27.4 | 3 |
| Detection | TweetEval stance-atheism (TW-A) (test) | Macro-F1 0.285 | 3 |
| Detection | TweetEval-offensive (Tw-O) (test) | Macro F1 56.3 | 3 |
| Detection | TweetEval-hate (Tw-H) (test) | Macro F1 59 | 3 |