PCOV-KWS: Multi-task Learning for Personalized Customizable Open Vocabulary Keyword Spotting

About

As advances in technologies such as the Internet of Things (IoT), Automatic Speech Recognition (ASR), Speaker Verification (SV), and Text-to-Speech (TTS) drive wider use of intelligent voice assistants, the demand for privacy and personalization has grown. In this paper, we introduce a multi-task learning framework for personalized, customizable open-vocabulary Keyword Spotting (PCOV-KWS). The framework uses a lightweight network to perform Keyword Spotting (KWS) and SV simultaneously, addressing personalized KWS requirements. We adopt a training criterion distinct from softmax-based loss: it transforms multi-class classification into multiple binary classifications, which eliminates inter-category competition. An optimization strategy for multi-task loss weighting is also employed during training. We evaluated our PCOV-KWS system on multiple datasets, demonstrating that it outperforms the baselines while requiring fewer parameters and lower computational resources.
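The abstract does not give the exact loss, so the following is only a minimal sketch of the general idea it describes: replacing a softmax loss with independent per-class binary classifiers, then combining the KWS and SV losses with weights. The function names, the sigmoid cross-entropy form, and the fixed weights `w_kws` and `w_sv` are all assumptions made for illustration, not the authors' formulation or their weighting strategy.

```python
import math

def softmax_cross_entropy(logits, target):
    """Softmax loss: classes compete through the shared normaliser."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def binary_terms(logits, target):
    """Per-class one-vs-rest losses; each term depends on its own logit only."""
    terms = []
    for k, z in enumerate(logits):
        y = 1.0 if k == target else 0.0
        # Numerically stable sigmoid cross-entropy computed on a raw logit.
        terms.append(max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z))))
    return terms

def binary_decomposed_loss(logits, target):
    """Sum of the independent binary losses (illustrative, assumed form)."""
    return sum(binary_terms(logits, target))

def multitask_loss(kws_loss, sv_loss, w_kws=0.5, w_sv=0.5):
    """Weighted sum of task losses; fixed weights are a placeholder assumption."""
    return w_kws * kws_loss + w_sv * sv_loss
```

In this sketch, raising the logit of a non-target class leaves the target class's binary term unchanged, whereas it always increases the softmax loss. That is one way to read "eliminates inter-category competition".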

Jianan Pan, Kejie Huang • 2026

Related benchmarks

Task                              Dataset                        Result         Rank
Keyword Spotting                  Google Speech Commands (test)  Accuracy 96.9  71
Open-vocabulary keyword spotting  LibriPhrase easy               EER 0.0132     11
Open-vocabulary keyword spotting  LibriPhrase LPH                AUC 89.46      5