| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Safety Alignment | Do-Not-Answer | MD: 0 | 52 |
| Safety Evaluation | Do-Not-Answer | MD Rate: 62.26 | 16 |
| Refusal Detection | Do-Not-Answer Portuguese (test) | Accuracy: 100 | 9 |
| Question Answering | Do-Not-Answer Portuguese Verbose Questions translated and adapted (61 question-answer pairs) | Mean Accuracy: 4.01 | 9 |
| Question Answering | Do-Not-Answer Portuguese Direct Questions translated and adapted (61 question-answer pairs) | Mean Accuracy Score: 4.14 | 9 |
| Safety Evaluation | Do-Not-Answer (test) | ASR: 3.195 | 9 |
| Refusal Evaluation | Do-Not-Answer | Refusal Rate: 95.21 | 7 |
| Jailbreak Attack Evaluation | Do-Not-Answer | ASR: 2.5 | 6 |
| Safety Evaluation | Do-not-Answer | Safety Score: 69.9 | 4 |
| Language Modeling | Do-Not-Answer | PPL: 154.81 | 1 |