TRACE

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Continual Learning | TRACE | BWT (%) 18.5 | 124 |
| Continual Learning | TRACE LLM-CL | Overall Performance (OP) 43.82 | 33 |
| General language understanding and reasoning | TRACE | C-STANCE Accuracy 59 | 29 |
| Trace-level safety monitoring | TRACE | ROC-AUC 90.1 | 28 |
| Trace-level detection | TRACE | AP 76.4 | 28 |
| Sabotage Detection | TRACE | Average Precision 90.1 | 28 |
| Case-level safety detection | TRACE | ROC-AUC 0.901 | 28 |
| Continual Learning | TRACE (test) | Overall Performance Score 57.8 | 25 |
| Continual Instruction Tuning | TRACE | AP 58.83 | 24 |
| Image Forgery Detection | TRACE (test) | Accuracy 97.2 | 18 |
| Visual Reasoning | TRACE (test) | BLEU-1 0.346 | 17 |
| Continual Learning | TRACE (overall) | Accuracy 68.1 | 15 |
| Continual Learning | TRACE sequence | GP Score 55.4 | 12 |
| Forgery Grounding | TRACE (test) | IoU 35.9 | 10 |
| Node-level Detection | E3-Trace DARPA | Precision 41 | 10 |
| Continual Learning | TRACE Llama2-7B-chat | Average Accuracy 55.2 | 9 |
| Post-Click GMV Prediction | TRACE online learning (days 57 to 82) | AUC 84.86 | 9 |
| Video Re-Synthesis | TRACE manually curated (test) | PSNR 1,492.56 | 8 |
| Instruction-following | TRACE | AA 57.6 | 7 |
| Blockchain Anomaly Detection | Trace (test) | AUC-PRC 3.86 | 6 |
| Time Series Classification | Trace (test) | Accuracy 100 | 5 |
| Advanced Persistent Threat Detection | TRACE | Precision 99.9 | 4 |
| 3-class segmentation | Trace 3-class (test) | Precision 99.804 | 3 |
| Time Series Chain Discovery | Trace | Hit Rate 42 | 3 |
| Computational Efficiency Analysis | TRACE Evaluation Samples | VRAM (GB) 2 | 2 |