| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Safety Evaluation | HEx-PHI | HEx-PHI Score | 97.2 | 148 |
| Backdoor Poisoning Attack | HEx-PHI (150 questions) | ASR (No Trigger) | 18.94 | 20 |
| Identity Shifting Attack | HEx-PHI Identity Shifting Attack (300 questions) | ASR | 22.83 | 20 |
| Jailbreak Attack | HEX-PHI | ASR | 54.9 | 16 |
| Jailbreak Defense | HEX-PHI | Harmful Score | 1.74 | 16 |
| Prosocial Alignment | HEx-PHI (test) | MIP | 76.3 | 14 |
| Jailbreak Attack Success Rate | HEx-PHI (test) | ASR Category 1 | 86.67 | 12 |
| Safety Evaluation | HEX-PHI (test) | ASR | 7.333 | 12 |
| Safety Evaluation | HEx-PHI Direct | Safety Score (1-ASR) | 99.67 | 8 |