
Multi-task Evaluation on Aggregate (LAMBADA, HellaSwag, PIQA, ARC, WinoGrande)

51.9 Avg Accuracy

Mistral (Full-Attention)

Updated 4mo ago

Evaluation Results

Method	Date	Avg Accuracy
Mistral (Full-Attention)	2024.07	51.9
BMoJo (Fading + Eidetic)	2024.07	49.1
BMoJo (Fading)	2024.07	48.9
Mamba (SSM)	2024.07	48.6
Hybrid (Sliding Attention + SSM)	2024.07	44.8
Mistral (Full-Attention)	2024.07	41.4
Mamba (SSM)	2024.07	41.2
BMoJo (Fading)	2024.07	40.7
BMoJo (Fading + Eidetic)	2024.07	40.5
Hybrid (Sliding Attention + SSM)	2024.07	39.3