WizardLM

Benchmarks

Task Name	Dataset Name	SOTA Result
Instruction Following	WizardLM (test)	Score 6.87	25
Instruction Tuning	WizardLM	Reasoning Score 75.07	20
Refusal behavior defense	WizardLM (test)	BadNet CACC 90.4	12
Toxic behavior defense	WizardLM (test)	BadNet CACC 0.904	12
Fine-tuning	WizardLM	Evaluation Loss 0.661	7
Instruction Following	WizardLM low-resource	Win Rate (bn) 62.8	7
Instruction Following Evaluation	WizardLM	Score 72.06	5
Generation	WizardLM (test)	LLM-as-a-Judge Score 48.37	2
