Downstream Suite

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Zero-shot Downstream Accuracy | Downstream Suite Zero-shot (BoolQ, HellaSwag, PIQA, RACE, WinoGrande) | BoolQ Accuracy: 82.4 | 19 |
| Zero-shot Question Answering and Reasoning | Downstream Suite Zero-shot (PIQA, HS, ARC, WG, RTE, OQA, BoolQ) | PIQA Accuracy: 80.79 | 12 |