Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks?

Abstract

Machine learning approaches for speech enhancement are becoming increasingly expressive, enabling ever more powerful modifications of input signals. In this paper, we demonstrate that this expressiveness introduces a vulnerability: advanced speech enhancement models can be susceptible to adversarial attacks. Specifically, we show that adversarial noise, carefully crafted and psychoacoustically masked by the original input, can be injected such that the enhanced speech output conveys an entirely different semantic meaning. We experimentally verify that contemporary predictive speech enhancement models can indeed be manipulated in this way. Furthermore, we highlight that diffusion models with stochastic samplers exhibit inherent robustness to such adversarial attacks by design.
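
To make the attack setting concrete, the sketch below shows the general shape of such an optimization: a gradient-based loop that searches for a small perturbation so that the *enhanced* output drifts toward attacker-chosen target speech. This is a minimal PyTorch illustration, not the authors' implementation; `enhancer` is assumed to be a differentiable predictive enhancement model, and the uniform `eps` clamp is a crude stand-in for the psychoacoustic masking the paper describes, which would instead shape the perturbation per time-frequency bin.

```python
# Minimal sketch of a PGD-style attack on a speech enhancement model.
# All names (craft_adversarial_noise, enhancer) are hypothetical.
import torch

def craft_adversarial_noise(enhancer, clean, target,
                            eps=1e-3, steps=200, lr=1e-4):
    """enhancer: differentiable enhancement model (waveform -> waveform)
    clean:    original input waveform, shape (1, T)
    target:   speech whose semantics the attacker wants the output to carry
    eps:      per-sample perturbation bound (simplified masking threshold)"""
    delta = torch.zeros_like(clean, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        enhanced = enhancer(clean + delta)
        # Pull the enhanced output toward the attacker's target speech.
        loss = torch.nn.functional.mse_loss(enhanced, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Keep the perturbation small enough to stay masked by the input.
        with torch.no_grad():
            delta.clamp_(-eps, eps)
    return (clean + delta).detach()
```
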
Authors (3)
Rostislav Makarov
Lea Schönherr
Timo Gerkmann
Submitted
September 25, 2025
arXiv Category
eess.AS

Key Contributions

This paper demonstrates that modern speech enhancement systems are vulnerable to adversarial attacks, where carefully crafted noise can alter the semantic meaning of the enhanced speech. It also highlights that diffusion models with stochastic samplers exhibit inherent robustness against such attacks.
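
The robustness claim for stochastic samplers matches a standard intuition from adversarial machine learning: when each forward pass draws fresh sampling noise, the attacker no longer faces a fixed differentiable function and must average gradients over many runs (expectation over transformation), which is expensive and still yields noisy estimates. Below is a hypothetical sketch of that averaging step, assuming the same differentiable-model interface as the attack sketch above; it is an illustration of the general principle, not the paper's evaluation code.

```python
import torch

def eot_gradient(stochastic_enhancer, x, target, n_draws=16):
    """Estimate an attack gradient against a stochastic (e.g. diffusion)
    enhancer by averaging over fresh sampler noise; names are hypothetical."""
    grads = []
    for _ in range(n_draws):
        x_adv = x.clone().requires_grad_(True)
        # Each call re-samples the stochastic trajectory, so outputs differ.
        loss = torch.nn.functional.mse_loss(stochastic_enhancer(x_adv), target)
        loss.backward()
        grads.append(x_adv.grad)
    # Even the averaged gradient remains noisy, which blunts the attack.
    return torch.stack(grads).mean(dim=0)
```
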

Business Value

These findings are crucial for building secure and trustworthy speech technologies: they expose how audio content can be maliciously manipulated and inform defenses that keep voice interfaces and communication systems reliable.