
Zero-shot tasks

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Zero-shot Evaluation | Zero-shot Tasks Average | Accuracy: 68.05 | 95
Zero-shot Commonsense Reasoning | Zero-shot tasks (test) | Average Accuracy: 61 | 12