
Speech

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Voiced/Unvoiced Detection | Speech | V/UV Recall: 94.21 | 50 |
| Audio Generation | Speech clean (test) | Generation Success Rate: 100 | 18 |
| Audio Generation | Speech Noise SNR=-10 (test) | Success Rate: 80 | 18 |
| Anomaly Detection | Speech | AUC-ROC: 0.6073 | 16 |
| Tabular Anomaly Detection | Speech | AUC-ROC: 0.676 | 14 |
| Anomaly Detection | Speech ODDS | AUC: 62.4 | 12 |
| Speech Synthesis | Speech Industrial Setting | MOS Prediction: 4.29 | 11 |
| Speech Synthesis | Speech Academic Setting | MOS Prediction: 3.65 | 11 |
| Speech Separation | Speech (test) | SI-SDRi: 18.8 | 11 |
| Fundamental Frequency Estimation | Speech SNR 0 dB | RPA50: 13.66 | 10 |
| Fundamental Frequency Estimation | Speech SNR 10 dB | RPA50: 68.85 | 10 |
| Fundamental Frequency Estimation | Speech SNR 20 dB | RPA50: 78.01 | 10 |
| Fundamental Frequency Estimation | Speech SNR 30 dB | RPA50: 80.24 | 10 |
| Fundamental Frequency Estimation | Speech SNR ∞ | RPA50: 80.91 | 10 |
| Text-prompted separation | Speech | SAJ: 4.67 | 9 |
| Audio Reconstruction | Speech | MUSHRA: 90.5 | 6 |
| Audio Quality Assessment | Speech | PCC Overall: 0.883 | 5 |
| Speech to Sound generation | Speech-S | WER (%): 6.15 | 3 |
| Audio-to-Text Retrieval | Speech (test) | R@1: 0.51 | 3 |
| Text-to-Audio Retrieval | Speech (test) | R@1: 7.1 | 3 |