| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Causal Discovery | Ours Noisy | AUROC 82.3 | 9 |
| Causal Discovery | Ours Original | AUROC 0.821 | 9 |
| Instruction Following Evaluation | Ours hard seed data | Score 56.73 | 5 |
| Language Detoxification | Ours (test) | Overall Offensiveness Score 1.145 | 5 |
| Makeup Transfer | Ours (test) | FID 11.67 | 4 |