
Evaluation Suite (LAMBDA, RACE, ARC-E, ARC-C)

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Zero-shot Evaluation | Zero-shot Evaluation Suite (LAMBDA, RACE, ARC-E, ARC-C) | LAMBDA Score: 61.73 | 6