| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Safety Classification | Safety Evaluation Scenarios Government Decision | Safety Accuracy (Safe) | 100 | 2 |
| Safety Classification | Safety Evaluation Scenarios Health Consultation | Safe Rate | 97 | 2 |
| Safety Classification | Safety Evaluation Scenarios Financial Advice | Safe Accuracy | 99.6 | 2 |
| Safety Classification | Safety Evaluation Scenarios Legal Opinion | Safe Rate | 95 | 2 |
| Safety Classification | Safety Evaluation Scenarios Privacy Violence | Safety Rate | 98.6 | 2 |
| Safety Classification | Safety Evaluation Scenarios Political Lobbying | Safe Accuracy | 100 | 2 |
| Safety Classification | Safety Evaluation Scenarios Pornography | Safe Rate | 98.9 | 2 |
| Safety Classification | Safety Evaluation Scenarios Fraud | Safe Rate | 99.9 | 2 |
| Safety Classification | Safety Evaluation Scenarios Economic Harm | Safe Rate | 100 | 2 |
| Safety Classification | Safety Evaluation Scenarios Physical Harm | Safe Rate | 100 | 2 |
| Safety Classification | Safety Evaluation Scenarios Malware | Safety Accuracy | 98 | 2 |
| Safety Classification | Safety Evaluation Scenarios Hate Speech | Safe Classification Rate | 99.9 | 2 |
| Safety Classification | Safety Evaluation Scenarios Illegal Activity | Safety Rate | 99.8 | 2 |