
HuPER: A Human-Inspired Framework for Phonetic Perception

About

We propose HuPER, a human-inspired framework that models phonetic perception as adaptive inference over acoustic-phonetic evidence and linguistic knowledge. With only 100 hours of training data, HuPER achieves state-of-the-art phonetic error rates on five English benchmarks and strong zero-shot transfer to 95 unseen languages. HuPER is also the first framework to enable adaptive, multi-path phonetic perception under diverse acoustic conditions. All training data, models, and code are open-sourced. Code and demo available at https://github.com/HuPER29/HuPER.

Chenxu Guo, Jiachen Lian, Yisi Liu, Baihe Huang, Shriyaa Narayanan, Cheol Jun Cho, Gopala Anumanchipalli • 2026

Related benchmarks

Task                       Dataset                           Metric          Result   Rank
Phone Feature Recognition  Buckeye (sociophonetic)           PFER            7.36     25
Phone Recognition          PRiSM Accented English Datasets   PFER (Timing)   8.3      12
Phone Recognition          PRiSM Multilingual Datasets       PFER (DRC)      32       12
Phonetic Perception        DRC-SE (DoReCo South-England)     PFER            0.0908   8
Phonetic Perception        L2-ARCTIC                         PFER            8        8
Phonetic Perception        SO762 (SpeechOcean762)            PFER            9        8
Phonetic Perception        EpaDB                             PFER            0.1066   8
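The benchmarks above report PFER (phonetic/phone feature error rate). The page does not define the metric, but error rates of this family are conventionally computed as a length-normalized Levenshtein (edit) distance between the predicted and reference phone sequences. The sketch below shows that conventional computation; it is an illustration of the standard metric, not HuPER's exact scoring code:

```python
def phone_error_rate(ref, hyp):
    """Normalized edit distance between two phone sequences.

    ref, hyp: lists of phone symbols (e.g. ARPAbet or IPA strings).
    Returns (substitutions + insertions + deletions) / len(ref).
    """
    m, n = len(ref), len(hyp)
    # Dynamic-programming table for Levenshtein distance.
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i  # cost of deleting all of ref[:i]
    for j in range(n + 1):
        d[0][j] = j  # cost of inserting all of hyp[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + sub,  # match / substitution
            )
    return d[m][n] / max(m, 1)


# Example: one substitution ("ae" -> "ah") over a 3-phone reference.
print(phone_error_rate(["k", "ae", "t"], ["k", "ah", "t"]))  # 0.333...
```

Note that the results table mixes scales (e.g. 7.36 vs 0.0908), which likely reflects percentages versus raw fractions of this same quantity.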
