
D2

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Classification | D2 | Mean Accuracy 91.611 | 30 |
| Aspect Sentiment Triplet Extraction | D2 (16Res) | F1 Score 74.83 | 25 |
| Aspect Sentiment Triplet Extraction | D2 15Res | F1 Score 66.12 | 25 |
| Aspect Sentiment Triplet Extraction | D2 14Lap | F1 Score 63.61 | 25 |
| Aspect Sentiment Triplet Extraction | D2 14Res | F1 Score 75.59 | 25 |
| Time Series Forecasting | D2 Synthetic (test) | MSE 0.599 | 16 |
| Medical Image Segmentation | D2 | DSC 87.35 | 14 |
| Regression | D2 | Average Relative MSE 0.084 | 10 |
| Classification | D2 0.15 (test) | Mean Accuracy 91.657 | 10 |
| ICD-10 Code Prediction | D2 noisy (test) | AUPRC (Z37) 93.86 | 10 |
| Outlier Detection | D2 with only clusteriers (test) | AUC 0.918 | 9 |
| Aspect-level sentiment classification | D2 | Accuracy 72.08 | 9 |
| Knee cartilage segmentation | D2 | Dice 94.14 | 7 |
| Root Cause Localization | D2 complete data conditions | Top-1 Accuracy 81.5 | 7 |
| Failure Triage | D2 complete data conditions | Precision 88.2 | 6 |
| Anomaly Detection | D2 complete data conditions | Precision 99.3 | 6 |
| Time-Domain Prediction | D2 | NMSE (dB) -18.58 | 6 |
| Reliability Assessment | D2 (test) | AU-ARC 92.1 | 5 |
| Frequency-Domain Prediction | D2 | NMSE (dB) -8.95 | 5 |
| Trajectory Planning | D2 OOD-OV | RRPI 3,103.8 | 4 |
| Trajectory Planning | D2 OOD | RRPI 4,372.4 | 4 |
| CSI Reconstruction | D2 | NMSE (dB) -15.91 | 3 |
| Root Cause Localization | D2 (test) | Execution Time (s) 8.08 | 2 |
| Failure Triage | D2 (test) | Execution Time (s) 1.45 | 2 |
| Anomaly Detection | D2 (test) | Execution Time (s) 6.71 | 2 |