
HUB

Benchmarks

Task Name                Dataset Name  SOTA Result         Trend
Sarcasm Ranking          HUB           Ranking Score 65.5  9
Sarcasm Matching         HUB           Matching Rate 64.5  9
Hallucination Detection  HUB (test)    Algorithmic 76.8    7