all

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Legal Contract Revision | ALL Avg | CQ Score 86.87 | 25 |
| Machine Translation (English to Hindi) | All weighted average (test) | BLEU Score 0.0675 | 14 |
| Machine Translation (Hindi to English) | All weighted average (test) | BLEU Score 10.28 | 14 |
| Point Cloud Quality Assessment | ALL | PLCC 0.913 | 12 |
| Machine Translation | ALL Average of two language pairs in four directions wmt22-comet-da | COMET 85.8 | 12 |
| Organ Segmentation | All 121 classes v1 (test) | DSC 90.49 | 10 |
| Generative Searching | All-50K (test) | HR@1 8.8 | 9 |
| Word Sense Disambiguation | ALL (test) | F1 Score 82 | 8 |
| Word Sense Linking | ALL FULL | Precision 80.4 | 5 |
| Video Action Recognition | All (Avg.) | Base Score 65.5 | 5 |
| Wide-angle portrait correction | all (test) | Line Accuracy 66.784 | 4 |
| Aggregate Performance | All Average | Accuracy 40.3 | 3 |
| Word Sense Linking | ALL FULL (test) | Precision 80.4 | 3 |
| Anomaly Detection | All MVTec-AD, VisA, MPDD, BTAD combined | I-AUROC 95.4 | 2 |
| Decision Making | All Aggregated (UK-based participants) | Final Accuracy 5.2 | 1 |