
EdgeSpot: Efficient and High-Performance Few-Shot Model for Keyword Spotting

About

We introduce EdgeSpot, an efficient few-shot keyword spotting model for edge devices that pairs an optimized BC-ResNet-based acoustic backbone with a trainable Per-Channel Energy Normalization (PCEN) frontend and lightweight temporal self-attention. Training uses knowledge distillation from a self-supervised teacher model, optimized with Sub-center ArcFace loss. EdgeSpot consistently achieves higher accuracy at a fixed false-alarm rate (FAR) than strong BC-ResNet baselines. The largest variant, EdgeSpot-4, improves the 10-shot accuracy at 1% FAR from 73.7% to 82.0% while requiring only 29.4M MACs and 128k parameters.
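
For readers unfamiliar with Per-Channel Energy Normalization, the following is a minimal NumPy sketch of the standard formulation: a first-order smoother per channel followed by adaptive gain control and root compression. It is not the EdgeSpot implementation; the function name `pcen` and the default values for `s`, `alpha`, `delta`, `r`, and `eps` are assumptions chosen for illustration. In a trainable frontend these parameters would be learned, typically one value per channel.

```python
import numpy as np

def pcen(energy, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Apply PCEN to non-negative filterbank energies of shape (frames, channels).

    All parameter defaults are illustrative assumptions, not values from the paper.
    Each parameter may be a scalar or an array of shape (channels,).
    """
    energy = np.asarray(energy, dtype=np.float64)
    smoothed = np.empty_like(energy)
    # Initialise the smoother with the first frame.
    smoothed[0] = energy[0]
    # First-order IIR smoothing along time, independently per channel.
    for t in range(1, energy.shape[0]):
        smoothed[t] = (1.0 - s) * smoothed[t - 1] + s * energy[t]
    # Adaptive gain control, then root compression with offset delta.
    return (energy / (eps + smoothed) ** alpha + delta) ** r - delta ** r
```

Passing arrays of shape `(channels,)` for `s`, `alpha`, `delta`, and `r` gives each channel its own normalization through broadcasting, which is the per-channel behaviour that a trainable variant would optimize.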

Oguzhan Buyuksolak, Alican Gok, Osman Erman Okman • 2026

Related benchmarks

Task              Dataset      Result              Rank
Keyword Spotting  GSC          Top-1 Accuracy: 82  22
Keyword Spotting  MSWC (test)  DET@1%: 95.7        20
