Research Paper · Relevant audiences: Speech scientists, Bioengineers, ML researchers, Assistive technology developers, Neuroscientists

emg2speech: synthesizing speech from electromyography using self-supervised speech models

Abstract

We present a neuromuscular speech interface that translates electromyographic (EMG) signals collected from orofacial muscles during speech articulation directly into audio. We show that self-supervised speech (SS) representations exhibit a strong linear relationship with the electrical power of muscle action potentials: SS features can be linearly mapped to EMG power with a correlation of $r = 0.85$. Moreover, EMG power vectors corresponding to different articulatory gestures form structured and separable clusters in feature space. This relationship: $\text{SS features} \xrightarrow{\texttt{linear mapping}} \text{EMG power} \xrightarrow{\texttt{gesture-specific clustering}} \text{articulatory movements}$, highlights that SS models implicitly encode articulatory mechanisms. Leveraging this property, we directly map EMG signals to SS feature space and synthesize speech, enabling end-to-end EMG-to-speech generation without explicit articulatory models and vocoder training.
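To make the reported linear relationship concrete, the following is a minimal sketch (not the authors' code) of how one might fit a linear map from frame-level SS features to per-channel EMG power and measure their alignment, in the spirit of the reported $r = 0.85$. Array shapes, feature dimensions, and the synthetic data are illustrative assumptions.

```python
# Illustrative sketch: linear map from SS speech features to EMG power,
# scored by Pearson correlation. Shapes and data are hypothetical.
import numpy as np

def linear_fit_correlation(ss_features: np.ndarray, emg_power: np.ndarray) -> float:
    """ss_features: (T, D) frame-level SS representations (assumed shape).
    emg_power:   (T, C) per-channel EMG power aligned to the same frames.
    Returns Pearson r between predicted and measured EMG power."""
    # Append a bias column and solve W = argmin ||X W - Y||^2 by least squares.
    X = np.hstack([ss_features, np.ones((ss_features.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, emg_power, rcond=None)
    pred = X @ W
    # Correlation over all frames and channels.
    return float(np.corrcoef(pred.ravel(), emg_power.ravel())[0, 1])

# Synthetic example: a noisy linear relation yields a high correlation.
rng = np.random.default_rng(0)
ss = rng.standard_normal((2000, 64))        # hypothetical SS feature dimension
true_map = rng.standard_normal((64, 8))     # hypothetical 8 EMG channels
emg = ss @ true_map + 0.3 * rng.standard_normal((2000, 8))
print(f"r = {linear_fit_correlation(ss, emg):.2f}")
```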
Authors: Harshavardhana T. Gowda, Lee M. Miller
Submitted: October 28, 2025
arXiv Category: cs.SD

Key Contributions

This paper presents a neuromuscular speech interface that synthesizes speech directly from electromyographic (EMG) signals of orofacial muscles. It demonstrates a strong linear relationship ($r = 0.85$) between self-supervised speech (SS) representations and EMG power, enabling EMG signals to be mapped directly into SS feature space for end-to-end speech synthesis. This approach bypasses explicit articulatory models and separate vocoder training, offering a novel pathway for communication aids.
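A minimal sketch of the mapping stage described above, under stated assumptions: a small network regresses frame-level EMG features onto the SS feature space, and a pretrained SS-feature synthesizer (assumed available, not shown) would then render audio. The layer sizes, feature dimensions, and the `pretrained_vocoder` step are illustrative assumptions, not the authors' architecture.

```python
# Sketch of an EMG -> SS-feature regressor trained against SS features
# extracted from parallel audio. Dimensions are hypothetical.
import torch
import torch.nn as nn

class EMGToSS(nn.Module):
    def __init__(self, emg_dim: int = 8, ss_dim: int = 768, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(emg_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, ss_dim),   # predict SS features frame by frame
        )

    def forward(self, emg: torch.Tensor) -> torch.Tensor:
        # emg: (batch, frames, emg_dim) -> (batch, frames, ss_dim)
        return self.net(emg)

# One training step: regress onto SS targets from a pretrained speech model.
model = EMGToSS()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
emg_batch = torch.randn(4, 100, 8)       # hypothetical EMG features
ss_targets = torch.randn(4, 100, 768)    # stand-in for SS features of parallel audio
loss = nn.functional.mse_loss(model(emg_batch), ss_targets)
loss.backward()
optimizer.step()
# At inference: audio = pretrained_vocoder(model(emg_batch))  # assumed component
```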

Business Value

For individuals who have lost the ability to speak due to motor impairments, speech synthesized from orofacial EMG could dramatically improve quality of life and social integration. It also opens new markets for assistive communication technologies.