| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Model Retrieval | Mistral-7B model tree (test) | Rank1 | 21 |
| Targeted Refusal | Mistral-7B Generation Evaluation Set | CA 97.01 | 15 |
| Sentiment Steering | Mistral-7B Generation (Evaluation Set) | Control Accuracy (CA) 96.38 | 15 |
| Jailbreak Defense | Mistral-7B Jailbreak Evaluation | GCG Attack Success Rate 0 | 6 |
| Text Generation | Mistral-7B v0.3 (test) | S-BLEU 34.2 | 3 |