
TorchAudio-Squim: Reference-less Speech Quality and Intelligibility measures in TorchAudio

About

Measuring the quality and intelligibility of a speech signal is usually a critical step in the development of speech processing systems. To enable this, a variety of metrics that measure quality and intelligibility under different assumptions have been developed. In this paper, we introduce tools and a set of models that estimate these known metrics using deep neural networks. The models are made available in the well-established TorchAudio library, the core audio and speech processing library within the PyTorch deep learning framework. We refer to the toolkit as TorchAudio-Squim (TorchAudio-Speech QUality and Intelligibility Measures). More specifically, in the current version of TorchAudio-Squim, we establish and release models for estimating PESQ, STOI, and SI-SDR among objective metrics and MOS among subjective metrics. We develop a novel approach for objective metric estimation and use a recently developed approach for subjective metric estimation. These models operate in a "reference-less" manner; that is, they do not require the corresponding clean speech as a reference for speech assessment. Given the unavailability of clean speech and the effortful process of subjective evaluation in real-world situations, such easy-to-use tools would greatly benefit speech processing research and development.
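To make concrete what a "reference-less" estimator has to predict, the sketch below computes SI-SDR the conventional way, with the clean reference available. It is an illustration of the standard metric definition only, not the TorchAudio-Squim models or their API; the function name `si_sdr_db` and the toy signals are hypothetical and chosen for this example.

```python
import math
from typing import Sequence


def si_sdr_db(estimate: Sequence[float], reference: Sequence[float]) -> float:
    """Scale-invariant SDR in dB, computed against a clean reference.

    Hypothetical helper for illustration; not part of TorchAudio.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0.0:
        raise ValueError("reference must not be all zeros")

    # Project the estimate onto the reference to get the optimally scaled target.
    scale = sum(e * r for e, r in zip(estimate, reference)) / ref_energy
    target = [scale * r for r in reference]
    # Whatever the projection does not explain is treated as distortion.
    residual = [e - t for e, t in zip(estimate, target)]

    target_energy = sum(t * t for t in target)
    residual_energy = sum(x * x for x in residual)
    if residual_energy == 0.0:
        return math.inf   # estimate is a scaled copy of the reference
    if target_energy == 0.0:
        return -math.inf  # estimate is orthogonal to the reference
    return 10.0 * math.log10(target_energy / residual_energy)


# Toy signals (assumed for the example): clean speech plus orthogonal noise.
clean = [1.0, 0.0, -1.0, 0.0]
noise = [0.0, 0.5, 0.0, -0.5]
noisy = [c + n for c, n in zip(clean, noise)]

print(round(si_sdr_db(noisy, clean), 2))                     # 6.02
print(round(si_sdr_db([3.0 * x for x in noisy], clean), 2))  # 6.02 (gain-invariant)
```

The computation needs `clean`, which is exactly the input that is missing in real-world recordings; the models described above are trained to predict such a score from the degraded signal alone.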

Anurag Kumar, Ke Tan, Zhaoheng Ni, Pranay Manocha, Xiaohui Zhang, Ethan Henderson, Buye Xu • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Speech Quality Assessment | TCD VOIP | SC | 87 | 5 |
| Speech Quality Assessment | TENCENT | RMSE | 0.42 | 5 |
| Speech Quality Assessment | TENCENT | SC | 0.8 | 5 |
| Speech Quality Assessment | P23 EXP1 | SC | 0.84 | 5 |
| Speech Quality Assessment | NISQA (test) | Quality Score (SC) | 0.74 | 5 |
| Speech Quality Assessment | NISQA P501 (test) | SC | 0.88 | 5 |
| Speech Quality Assessment | VoiceMOS 1 (test) | SC | 0.71 | 5 |
| Speech Quality Assessment | VoiceMOS 2 (test) | SC Score | 0.62 | 5 |
| Speech Quality Assessment | NOIZEUS | SC Score | 0.72 | 5 |
| Speech Quality Assessment | NISQA LT (test) | SC Score | 0.59 | 5 |
Showing 10 of 20 rows
