| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Knowledge Unlearning | WMDP bio | Accuracy | 71.2 | 42 |
| Question Answering | WMDP Biology | Default Score | 64.5 | 38 |
| Question Answering | WMDP Cyber QA | Default Accuracy | 44.3 | 38 |
| Knowledge Unlearning | WMDP cyber | Accuracy | 47.21 | 38 |
| Knowledge Recovery | WMDP-Bio 100-sample subset | ASR | 0.93 | 36 |
| Fluency Assessment | WMDP | Mean Fluency | 3.46 | 22 |
| Machine Unlearning | WMDP Cyber (test) | MMLU | 60.65 | 21 |
| Hazard Knowledge Evaluation | WMDP | Accuracy | 68.98 | 18 |
| Machine Unlearning | WMDP | Unlearn Score | 76.1 | 16 |
| Unlearning Detection | WMDP | Accuracy | 100 | 16 |
| Machine Unlearning | WMDP | Bio Score | 64.7 | 15 |
| Machine Unlearning | WMDP | Acc (Bio) | 74.16 | 12 |
| Knowledge Unlearning | WMDP Bio (test) | Accuracy Forget | 64.81 | 11 |
| Harmful Knowledge Removal | WMDP Bio | Acc_r | 78.5 | 10 |
| Tracing | WMDP (test) | TSR | 100 | 10 |
| Machine Unlearning | WMDP average of biology and cyber | Accuracy | 0.557 | 10 |
| Machine Unlearning | WMDP Bio (test) | Bio Score | 63.7 | 10 |
| Unlearning | WMDP bio | WMDPbio Score | 0.628 | 9 |
| Unlearning | WMDP | WMDP Score | 0.489 | 9 |
| Machine Unlearning | WMDP bio | Multi-turn ASR Error Rate | 3.1 | 9 |
| Machine Unlearning | WMDP bio | Accuracy | 48.9 | 9 |
| Machine Unlearning | WMDP | Accuracy | 39.7 | 9 |
| Machine Unlearning | WMDP Cyber | Rel | 7.19 | 9 |
| Machine Unlearning | WMDP Bio | Rel Score | 6.72 | 9 |
| Question Answering | WMDP multiple-choice QA | Bio Accuracy | 65.1 | 9 |