| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Adversarial Attack | ED | GL: 38.82 | 18 |
| Multi-turn role-play | ED | Success Rate (SR): 92.1 | 12 |
| Entity Disambiguation | 6 standard ED | Avg ED F1: 90 | 6 |
| Empathetic Dialogue | ED | Success Rate (SR): 92.1 | 5 |
| Stance Detection | ED Erdoğan Turkish election data (test) | Precision (PRO): 99 | 4 |
| Procedure Coding | ED CPT | Recall: 48.8 | 2 |