DNSMOS P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors

About

Human subjective evaluation is the gold standard for evaluating speech quality optimized for human perception. Perceptual objective metrics serve as a proxy for subjective scores. We recently developed a non-intrusive speech quality metric called Deep Noise Suppression Mean Opinion Score (DNSMOS) using scores from ITU-T Rec. P.808 subjective evaluation. The P.808 scores reflect the overall quality of the audio clip. The ITU-T Rec. P.835 subjective evaluation framework gives standalone quality scores for speech and background noise in addition to the overall quality. In this work, we train an objective metric on P.835 human ratings that outputs three scores: i) speech quality (SIG), ii) background noise quality (BAK), and iii) overall quality (OVRL) of the audio. The developed metric is highly correlated with human ratings, with a Pearson's Correlation Coefficient (PCC) of 0.94 for SIG and 0.98 for BAK and OVRL. This is the first non-intrusive P.835 predictor we are aware of. DNSMOS P.835 is made publicly available as an Azure service.

Chandan K A Reddy, Vishak Gopal, Ross Cutler • 2021
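
The abstract reports agreement with human ratings as a Pearson's Correlation Coefficient per score (SIG, BAK, OVRL). As a minimal sketch of how such a figure could be computed, the Python below correlates predicted scores with human ratings for each of the three dimensions. The `pearson` helper, the dictionary layout, and all numeric values are hypothetical placeholders for illustration; they are not data from the paper and this is not the authors' evaluation code.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variance terms (unnormalized; the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings on a 1-5 MOS scale, one entry per evaluated item.
# These numbers are made up for illustration only.
human = {
    "SIG": [3.1, 3.6, 2.8, 4.0, 3.3],
    "BAK": [2.5, 4.1, 3.0, 4.4, 3.7],
    "OVRL": [2.4, 3.5, 2.6, 3.9, 3.1],
}
predicted = {
    "SIG": [3.0, 3.4, 3.0, 3.9, 3.2],
    "BAK": [2.7, 4.0, 3.1, 4.3, 3.6],
    "OVRL": [2.5, 3.4, 2.7, 3.8, 3.0],
}

# One correlation per P.835 dimension.
pcc = {name: pearson(human[name], predicted[name]) for name in human}

for name, value in pcc.items():
    print(f"{name}: PCC = {value:.2f}")
```

The correlation is computed separately for each dimension because P.835 rates speech, background noise, and overall quality independently.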

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Preference Evaluation | URGENT25-SQA | Acc@0.552 | 15 |
| Preference Evaluation | SOMOS | Acc@0.549 | 15 |
| Preference Evaluation | URGENT SQA 24 | Acc@0.554 | 15 |
| Preference Evaluation | NISQA-FOR | Acc@0.547 | 15 |
| Preference Evaluation | CHiME UDASE 7 (test) | Acc@0.537 | 15 |
| Preference Evaluation | SpeechEval | Acc@0.552 | 15 |
| Preference Evaluation | NISQA-P501 | Acc@0.546 | 15 |
| Preference Evaluation | TMHINT-QI | Acc@0.543 | 15 |
| Preference Evaluation | SpeechJudge | Acc@0.57 | 15 |
| Speech Quality Assessment | BC 19 | LCC 0.48 | 12 |
