📄 Abstract
We introduce AnyEnhance, a unified generative model for voice enhancement
that processes both speech and singing voices. Built on a masked generative
model, AnyEnhance supports
a wide range of enhancement tasks including denoising,
dereverberation, declipping, super-resolution, and target speaker extraction,
all simultaneously and without fine-tuning. AnyEnhance introduces a
prompt-guidance mechanism for in-context learning, which allows the model to
natively accept a reference speaker's timbre. This boosts
enhancement performance when reference audio is available and enables the
target speaker extraction task without altering the underlying architecture.
We also introduce a self-critic mechanism into the generative process
for masked generative models, yielding higher-quality outputs through iterative
self-assessment and refinement. Extensive experiments on various enhancement
tasks demonstrate that AnyEnhance outperforms existing methods in terms of both
objective metrics and subjective listening tests. Audio demos are publicly
available at https://amphionspace.github.io/anyenhance. An open-source
implementation is provided at
https://github.com/viewfinder-annn/anyenhance-v1-ccf-aatc.
Authors (8)
Junan Zhang
Jing Yang
Zihao Fang
Yuancheng Wang
Zehua Zhang
Zhuo Wang
+2 more
Submitted
January 26, 2025
Key Contributions
AnyEnhance is a unified generative model for voice enhancement that handles both speech and singing voices across multiple tasks (denoising, dereverberation, declipping, super-resolution, and target speaker extraction) without fine-tuning. It introduces a prompt-guidance mechanism that lets the model condition on a reference speaker's timbre, and a self-critic mechanism that refines outputs through iterative self-assessment.
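The iterative self-assessment and refinement idea can be illustrated with a generic masked-decoding loop. The following is only a minimal sketch under stated assumptions, not the paper's implementation: the cosine masking schedule, the token representation, and the names iterative_decode, toy_predict, and toy_critic are all hypothetical. The critic is passed in as a separate callable for clarity, whereas a self-critic would presumably reuse the generative model itself.

```python
import math
import random

MASK = None  # placeholder for a masked token (assumption: tokens are ints)


def cosine_schedule(step, total_steps):
    """Fraction of positions left masked after 0-indexed `step`."""
    return math.cos(math.pi / 2 * (step + 1) / total_steps)


def iterative_decode(length, predict, critic, total_steps=8):
    """Fill masked tokens, then re-mask the positions the critic trusts least."""
    tokens = [MASK] * length
    for step in range(total_steps):
        proposal = predict(tokens)
        # Fill only masked positions; tokens accepted earlier are kept.
        tokens = [p if t is MASK else t for t, p in zip(tokens, proposal)]
        n_mask = int(length * cosine_schedule(step, total_steps))
        if n_mask == 0:
            break  # nothing left to re-mask; always reached on the last step
        scores = critic(tokens)
        # Re-mask the lowest-scoring positions so they can be re-predicted.
        worst = sorted(range(length), key=lambda i: scores[i])[:n_mask]
        for i in worst:
            tokens[i] = MASK
    return tokens


# Toy stand-ins (hypothetical): a noisy generator and an oracle critic.
TARGET = [i % 4 for i in range(16)]
rng = random.Random(0)


def toy_predict(tokens):
    # Proposes the correct token about 70% of the time, else a random one.
    return [t if rng.random() < 0.7 else rng.randrange(4) for t in TARGET]


def toy_critic(tokens):
    # Scores a token 1.0 when it matches the target, 0.0 otherwise.
    return [1.0 if tok == tgt else 0.0 for tok, tgt in zip(tokens, TARGET)]


result = iterative_decode(len(TARGET), toy_predict, toy_critic, total_steps=4)
```

In this sketch the critic may re-mask any position, including tokens accepted in earlier steps, which is what allows earlier mistakes to be revised rather than frozen.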
Business Value
Enables high-quality, versatile voice enhancement for applications like virtual assistants, content creation, and communication tools, improving user experience.