Mistral-7B

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Model Retrieval | Mistral-7B model tree (test) | Rank1 | 21 |
| Targeted Refusal | Mistral-7B Generation Evaluation Set | CA 97.01 | 15 |
| Sentiment Steering | Mistral-7B Generation (Evaluation Set) | Control Accuracy (CA) 96.38 | 15 |
| Jailbreak Defense | Mistral-7B Jailbreak Evaluation | GCG Attack Success Rate 0 | 6 |
| Text Generation | Mistral-7B v0.3 (test) | S-BLEU 34.2 | 3 |