Mistral

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Large Language Model Watermarking | Mistral-7B-Instruct (test) | Perplexity (PPL): 1.37 | 34 |
| Language Modeling | Mistral-7B | Perplexity (Mistral-7B): 5.45 | 24 |
| Jailbreak Attack | Mistral-7B | NR: 40 | 20 |
| Hallucination Tracing | Mistral | Recall@k: 78.95 | 15 |
| LLM fingerprinting | Mistral-7B | AUC: 1 | 10 |
| Jailbreak Defense | Mistral-7B-Instruct | GCG Attack Count: 4 | 6 |
| Peak VRAM measurement | Mistral-Sm-24B | Peak VRAM (RTX): 70.6 | 4 |
| KV Cache Quantization | Mistral | ΔPPL: 0.0012 | 3 |
| Model Lineage Attestation | Mistral family | TPR: 0.99 | 1 |