ARC-Challenge, ARC-Easy, BoolQ, CrowS-Pairs, OpenBookQA, PIQA, RACE, SiQA, TruthfulQA, WinoGrande

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Zero-shot Evaluation | ARC-Challenge, ARC-Easy, BoolQ, CrowS-Pairs, OpenBookQA, PIQA, RACE, SiQA, TruthfulQA, WinoGrande zero-shot | ARC-C Accuracy: 52.1 | 26