
WizardLM

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Instruction Tuning | WizardLM | Reasoning Score: 75.07 | 20 |
| Instruction Following | WizardLM (test) | Score: 6.87 | 13 |
| Refusal behavior defense | WizardLM (test) | BadNet CACC: 90.4 | 12 |
| Toxic behavior defense | WizardLM (test) | BadNet CACC: 0.904 | 12 |
| Instruction Following | WizardLM low-resource | Win Rate (bn): 62.8 | 7 |
| Instruction Following Evaluation | WizardLM | Score: 72.06 | 5 |
| Generation | WizardLM (test) | LLM-as-a-Judge Score: 48.37 | 2 |