
Speech

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Voiced/Unvoiced Detection | Speech | V/UV Recall 94.21 | 50 |
| Anomaly Detection | Speech | AUC-ROC 0.9849 | 33 |
| Audio Generation | Speech clean (test) | Generation Success Rate 100 | 18 |
| Audio Generation | Speech Noise SNR=-10 (test) | Success Rate 80 | 18 |
| Outlier Detection | speech (historical) | AUROC 54.88 | 17 |
| Outlier Detection | speech (Group II) | AUROC 54.88 | 14 |
| Tabular Anomaly Detection | Speech | AUC-ROC 0.676 | 14 |
| Anomaly Detection | Speech ODDS | AUC 62.4 | 12 |
| Speech Synthesis | Speech Industrial Setting | MOS Prediction 4.29 | 11 |
| Speech Synthesis | Speech Academic Setting | MOS Prediction 3.65 | 11 |
| Speech Separation | Speech (test) | SI-SDRi 18.8 | 11 |
| Anomaly Detection | speech Out-of-Domain | F1 Score 10.49 | 10 |
| Fundamental Frequency Estimation | Speech SNR 0 dB | RPA50 13.66 | 10 |
| Fundamental Frequency Estimation | Speech SNR 10 dB | RPA50 68.85 | 10 |
| Fundamental Frequency Estimation | Speech SNR 20 dB | RPA50 78.01 | 10 |
| Fundamental Frequency Estimation | Speech SNR 30 dB | RPA50 80.24 | 10 |
| Fundamental Frequency Estimation | Speech SNR ∞ | RPA50 80.91 | 10 |
| Text-prompted separation | Speech | SAJ 4.67 | 9 |
| Outlier Detection | speech | Precision-s10.65 | 8 |
| Audio Reconstruction | Speech | MUSHRA 90.5 | 6 |
| Audio Quality Assessment | Speech | PCC Overall 0.883 | 5 |
| Speech to Sound generation | Speech-S | WER (%) 6.15 | 3 |
| Audio-to-Text Retrieval | Speech (test) | R@1 0.51 | 3 |
| Text-to-Audio Retrieval | Speech (test) | R@1 7.1 | 3 |