
Avg over 11 datasets

Benchmarks

Task Name | Dataset Name | SOTA Result
Base-to-New Generalization | Avg over 11 datasets | Base Score: 88.11