
Wikitext, C4, HellaSwag, MMLU, ARC, MATH, and GSM8K

Benchmarks

Task Name: Quantization Robustness Evaluation
Dataset Name: Average across Wikitext, C4, HellaSwag, MMLU, Arc-C, MATH500, and GSM8K
SOTA Result: Accuracy Loss Delta (%): -0.29
Trend: 5