
SCOREQ: Speech Quality Assessment with Contrastive Regression

About

In this paper, we present SCOREQ, a novel approach for speech quality prediction. SCOREQ is a triplet loss function for contrastive regression that addresses the domain generalisation shortcoming exhibited by state-of-the-art no-reference speech quality metrics. In the paper we: (i) illustrate how L2 loss training fails to capture the continuous nature of the mean opinion score (MOS) labels; (ii) demonstrate the lack of generalisation through a benchmarking evaluation across several speech domains; (iii) outline our approach and explore the impact of the architectural design decisions through incremental evaluation; (iv) evaluate the final model against state-of-the-art models on a wide variety of data and domains. The results show that SCOREQ addresses the lack of generalisation observed in state-of-the-art speech quality metrics. We conclude that using a triplet loss function for contrastive regression improves generalisation for speech quality prediction models and has potential utility across a wide range of applications that use regression-based predictive models.
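As a rough illustration of what a triplet loss for contrastive regression can look like, the sketch below builds triplets from continuous MOS labels: for each anchor, a sample counts as a positive relative to another sample (the negative) when its MOS is closer to the anchor's. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the function name `triplet_regression_loss`, the fixed `margin`, the Euclidean distance, and the exhaustive use of all valid triplets are illustrative choices that the abstract does not specify.

```python
import numpy as np

def triplet_regression_loss(emb, mos, margin=0.2):
    """Mean hinge loss over all triplets (anchor, positive, negative) in a batch.

    Hypothetical helper: a triplet is valid when the positive's MOS is
    strictly closer to the anchor's MOS than the negative's MOS is.
    """
    emb = np.asarray(emb, dtype=float)   # shape (B, D)
    mos = np.asarray(mos, dtype=float)   # shape (B,)
    # Pairwise Euclidean distances in embedding space, shape (B, B).
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    # Pairwise absolute MOS differences, shape (B, B).
    gap = np.abs(mos[:, None] - mos[None, :])
    # valid[a, p, n] is True when p is closer in MOS to a than n is.
    valid = gap[:, :, None] < gap[:, None, :]
    # Exclude triplets that reuse the anchor as its own positive or negative.
    idx = np.arange(len(mos))
    valid[idx, idx, :] = False
    valid[idx, :, idx] = False
    # Hinge term: the positive should sit closer to the anchor than the
    # negative does, by at least `margin` (assumed constant here).
    hinge = np.maximum(0.0, dist[:, :, None] - dist[:, None, :] + margin)
    if not valid.any():
        return 0.0
    return float(hinge[valid].mean())

# Embeddings ordered like the labels incur no loss; scrambled ones do.
ordered = triplet_regression_loss([[0.0], [1.0], [2.0]], [1.0, 2.0, 3.0])
scrambled = triplet_regression_loss([[0.0], [2.0], [1.0]], [1.0, 2.0, 3.0])
```

Unlike an L2 loss on the predicted score, a loss of this shape only constrains the relative ordering of distances in the embedding space, which is one way to reflect the continuous nature of MOS labels.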

Alessandro Ragano, Jan Skoglund, Andrew Hines • 2024

Related benchmarks

Task                       Dataset               Result        Rank
Preference Evaluation      NISQA-P501            Acc@0.5: 81   15
Preference Evaluation      NISQA-FOR             Acc@0.5: 79   15
Preference Evaluation      URGENT SQA 24         Acc@0.5: 58   15
Preference Evaluation      CHiME UDASE 7 (test)  Acc@0.5: 60   15
Preference Evaluation      URGENT25-SQA          Acc@0.5: 57   15
Preference Evaluation      SOMOS                 Acc@0.5: 51   15
Preference Evaluation      TMHINT-QI             Acc@0.5: 48   15
Preference Evaluation      SpeechEval            Acc@0.5: 65   15
Preference Evaluation      SpeechJudge           Acc@0.5: 12   15
Speech Quality Assessment  NISQA-P501            LCC: 0.93     12
Showing 10 of 39 rows
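
For readers unfamiliar with the LCC metric, it conventionally denotes the linear (Pearson) correlation coefficient between predicted quality scores and MOS labels. Assuming this page follows that convention, it can be computed as in the sketch below. The function name `lcc` and the sample values are hypothetical and are not taken from the benchmark.

```python
import numpy as np

def lcc(predicted, mos):
    """Pearson linear correlation between predictions and MOS labels."""
    predicted = np.asarray(predicted, dtype=float)
    mos = np.asarray(mos, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; take the off-diagonal entry.
    return float(np.corrcoef(predicted, mos)[0, 1])

# Made-up scores for illustration only.
example = lcc([1.8, 2.9, 3.1, 4.4], [2.0, 3.0, 3.5, 4.5])
```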
