AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement

About

We introduce AnyEnhance, a unified generative model for voice enhancement that handles both speech and singing voices. Built on a masked generative model, AnyEnhance supports a wide range of enhancement tasks, including denoising, dereverberation, declipping, super-resolution, and target speaker extraction, all simultaneously and without fine-tuning. AnyEnhance introduces a prompt-guidance mechanism for in-context learning, which allows the model to natively accept a reference speaker's timbre. This mechanism can boost enhancement performance when reference audio is available, and it enables target speaker extraction without altering the underlying architecture. We also introduce a self-critic mechanism into the generative process of masked generative models, which yields higher-quality outputs through iterative self-assessment and refinement. Extensive experiments on various enhancement tasks demonstrate that AnyEnhance outperforms existing methods in both objective metrics and subjective listening tests. Audio demos are publicly available at https://amphionspace.github.io/anyenhance. An open-source implementation is provided at https://github.com/viewfinder-annn/anyenhance-v1-ccf-aatc.
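
The "iterative self-assessment and refinement" idea can be illustrated with a toy sketch. This is not the authors' implementation: the cosine masking schedule, the `MASK` id, and the names `self_critic_decode`, `toy_generator`, and `toy_critic` are all assumptions made for illustration, and the random stand-ins replace what would be neural networks in a real system. The sketch only shows the general loop: fill the masked positions, score every position with a critic, then re-mask the lowest-scoring positions and repeat.

```python
import math
import random

MASK = -1  # hypothetical id marking a masked token position


def self_critic_decode(length, steps, generator, critic):
    """Iteratively fill a fully masked sequence, re-masking low-scoring tokens."""
    tokens = [MASK] * length
    for step in range(1, steps + 1):
        proposal = generator(tokens)  # fill every masked position
        scores = critic(proposal)     # one quality score per position
        # Cosine schedule (an assumption): fraction left masked after this step.
        n_mask = int(length * math.cos(math.pi / 2 * step / steps))
        # Re-mask the n_mask positions the critic trusts least.
        worst_first = sorted(range(length), key=lambda i: scores[i])
        tokens = list(proposal)
        for i in worst_first[:n_mask]:
            tokens[i] = MASK
    return tokens


# Toy stand-ins so the sketch runs; a real system would use neural networks.
rng = random.Random(0)
VOCAB = 8
target = [rng.randrange(VOCAB) for _ in range(16)]


def toy_generator(tokens):
    # Propose a random token wherever the sequence is still masked.
    return [rng.randrange(VOCAB) if t == MASK else t for t in tokens]


def toy_critic(tokens):
    # Score 1.0 where the token matches the toy reference, else 0.0.
    return [1.0 if t == g else 0.0 for t, g in zip(tokens, target)]


result = self_critic_decode(len(target), steps=8,
                            generator=toy_generator, critic=toy_critic)
```

Because the schedule reaches zero at the last step, the returned sequence contains no masked positions. Unlike a scheme that only ranks newly filled tokens, this sketch lets the critic re-mask any position, including ones accepted in earlier steps.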

Junan Zhang, Jing Yang, Zihao Fang, Yuancheng Wang, Zehua Zhang, Zhuo Wang, Fan Fan, Zhizheng Wu • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Speech Enhancement | DNS Challenge Real Recordings (test) | SIG Score 3.488 | 32 |
| Speech Enhancement | DNS Challenge With Reverb (test) | SIG 3.5 | 24 |
| Speech Enhancement | DNS no-reverb 2020 (test) | -- | 20 |
| General Speech Restoration | DNS-Real Out-Domain (test) | SIG 3.488 | 17 |
| Target Speaker Extraction | Libri2Mix Clean | DNSMOS OVL 3.353 | 14 |
| Noise Suppression | Interspeech DNS Challenge With Reverb 2020 (test) | SIG Score 3.5 | 10 |
| Noise Suppression | Interspeech DNS Challenge blind No Reverb 2020 (test) | SIG Score 3.64 | 10 |
| Speech Enhancement | DNS Challenge no-reverb | DNSMOS 3.406 | 9 |
| Speech Enhancement | DNS Challenge HardSet | DNSMOS 3.384 | 8 |
| General Speech Restoration | Voicefixer-GSR In-Domain (test) | SIG 3.406 | 7 |

Showing 10 of 13 rows
