Amazon Yelp

Benchmarks

Task Name                    Dataset Name                      SOTA Result     Trend
Classification               Amazon/Yelp Spurious 10% (test)   F1 Score 49.22  9
Classification               Amazon Yelp Spurious 30% (test)   F1 Score 59.68  9
Classification               Amazon Yelp Spurious 50% (test)   F1 Score 70.08  9
Classification               Amazon Yelp Spurious 70% (test)   F1 Score 80.34  9
Classification               Amazon Yelp Spurious 90% (test)   F1 Score 91.39  9
Classification               Amazon/Yelp Spurious 90% (train)  F1 Score 99.99  8
Domain Shift Classification  Amazon -> Yelp (test)             ECE 5.8         6