Natural Language Understanding on BIG-Bench Hard (BBH)

42.1 Accuracy

Arcana

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
Arcana 2024.10		42.1
Vicuna-v1.5 2024.10		41.2
LLaMA-2 2024.10		38.2
LLaMA-2-Chat 2024.10		35.6
WizardLM 2024.10		34.7