
QA Suite

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Question Answering | QA Suite Zero-shot (PIQA, ARC-E, ARC-C, BoolQ, HellaSwag, WinoGrande) | PIQA Accuracy: 80.85 |