| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Annotation Accuracy | DeepSeek-R1 Experiment 1 | F1 Score (Ga): 100 | 40 |
| LLM Attack Effectiveness | DeepSeek-R1-Distill-Llama-8B serving environment | TTFT (s): 0.08 | 6 |
| Text Naturalness Evaluation | DeepSeek-R1 Experiment 2 | BERT Score: 0.99 | 5 |