| Task Name | Dataset Name | SOTA Result | Trend | |
|---|---|---|---|---|
| Continual Learning | TRACE | BWT (%): 18.5 | 124 | |
| Continual Learning | TRACE LLM-CL | Overall Performance (OP): 43.82 | 33 | |
| General language understanding and reasoning | TRACE | C-STANCE Accuracy: 59 | 29 | |
| Trace-level safety monitoring | TRACE | ROC-AUC: 90.1 | 28 | |
| Trace-level detection | TRACE | AP: 76.4 | 28 | |
| Sabotage Detection | TRACE | Average Precision: 90.1 | 28 | |
| Case-level safety detection | TRACE | ROC-AUC: 0.901 | 28 | |
| Continual Learning | TRACE (test) | Overall Performance Score: 57.8 | 25 | |
| Continual Instruction Tuning | TRACE | AP: 58.83 | 24 | |
| Image Forgery Detection | TRACE (test) | Accuracy: 97.2 | 18 | |
| Visual Reasoning | TRACE (test) | BLEU-1: 0.346 | 17 | |
| Continual Learning | TRACE (overall) | Accuracy: 68.1 | 15 | |
| Continual Learning | TRACE sequence | GP Score: 55.4 | 12 | |
| Forgery Grounding | TRACE (test) | IoU: 35.9 | 10 | |
| Node-level Detection | E3-Trace DARPA | Precision: 41 | 10 | |
| Continual Learning | TRACE Llama2-7B-chat | Average Accuracy: 55.2 | 9 | |
| Post-Click GMV Prediction | TRACE online learning (days 57 to 82) | AUC: 84.86 | 9 | |
| Video Re-Synthesis | TRACE manually curated (test) | PSNR: 1,492.56 | 8 | |
| Instruction-following | TRACE | AA: 57.6 | 7 | |
| Blockchain Anomaly Detection | Trace (test) | AUC-PRC: 3.86 | 6 | |
| Time Series Classification | Trace (test) | Accuracy: 100 | 5 | |
| Advanced Persistent Threat Detection | TRACE | Precision: 99.9 | 4 | |
| 3-class segmentation | Trace 3-class (test) | Precision: 99.804 | 3 | |
| Time Series Chain Discovery | Trace | Hit Rate: 42 | 3 | |
| Computational Efficiency Analysis | TRACE Evaluation Samples | VRAM (GB): 2 | 2 | |