
Mistral-7B

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Model Retrieval | Mistral-7B model tree (test) | Rank1 | 21 |
| Targeted Refusal | Mistral-7B Generation Evaluation Set | CA: 97.01 | 15 |
| Sentiment Steering | Mistral-7B Generation (Evaluation Set) | Control Accuracy (CA): 96.38 | 15 |
| Model Fingerprinting Detection | Mistral-7B black-box setting v0.3 | True Positive Rate (TPR): 98.4 | 10 |
| Language Modeling | Mistral-7B Long-context (8k window) | Perplexity: 4.568 | 8 |
| Language Modeling | Mistral-7B Long-context (4k window) | Perplexity: 5.241 | 8 |
| Jailbreak Defense | Mistral-7B Jailbreak Evaluation | GCG Attack Success Rate: 0 | 6 |
| Adversarial Attack | Mistral-7B (successful attacks) | Unique Queries: 3,021 | 3 |
| Text Generation | Mistral-7B v0.3 (test) | S-BLEU: 34.2 | 3 |
| Weight Quantization Fidelity | Mistral-7B | MSE: 5.36 | 2 |