UniSE: A Unified Framework for Decoder-only Autoregressive LM-based Speech Enhancement

Abstract

The development of neural audio codecs (NACs) has greatly promoted the application of language models (LMs) to speech processing and understanding. However, the effectiveness of autoregressive (AR) LM-based models in unifying different sub-tasks of speech enhancement (SE) has not yet been verified. In this work, we propose UniSE, a unified decoder-only LM-based framework that handles different SE tasks, including speech restoration, target speaker extraction, and speech separation. It takes input speech features as conditions and generates discrete tokens of the target speech using AR modeling, which facilitates compatibility between the distinct learning patterns of multiple tasks. Experiments on several benchmarks indicate that the proposed UniSE achieves competitive performance compared to discriminative and generative baselines, showing the capacity of LMs to unify SE tasks. The demo page is available at https://github.com/hyyan2k/UniSE.
Authors (5)
Haoyin Yan
Chengwei Liu
Shaofei Xue
Xiaotao Liang
Zheng Xue
Submitted
October 23, 2025
arXiv Category
cs.SD

Key Contributions

The paper proposes UniSE, a unified framework that handles several speech enhancement tasks (speech restoration, target speaker extraction, and speech separation) with a single decoder-only autoregressive language model. The model generates discrete tokens of the target speech conditioned on input speech features, and it achieves competitive performance against discriminative and generative baselines on several benchmarks.
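
To make the idea of conditioned autoregressive generation concrete, here is a minimal toy sketch in Python. It is not the UniSE implementation: the vocabulary size, the end-of-sequence id, and the functions `toy_scores` and `generate` are all hypothetical, and the "model" is a hand-written rule standing in for a trained decoder-only LM over neural audio codec tokens. It only illustrates the loop structure: score the next token given the condition and the tokens generated so far, pick one, append it, and repeat until an end token appears.

```python
VOCAB_SIZE = 8  # hypothetical codec vocabulary size (assumption)
EOS = 0         # hypothetical end-of-sequence token id (assumption)


def toy_scores(condition, generated):
    """Stand-in for a decoder-only LM (not a trained model).

    Returns one score per vocabulary entry, given the condition frames
    and the tokens generated so far. The rule is deliberately trivial:
    emit one token per condition frame, then favor EOS.
    """
    step = len(generated)
    if step >= len(condition):
        target = EOS
    else:
        # Map the current condition frame to a non-EOS token id.
        target = 1 + (condition[step] % (VOCAB_SIZE - 1))
    return [1.0 if v == target else 0.0 for v in range(VOCAB_SIZE)]


def generate(condition, max_len=50):
    """Greedy autoregressive decoding conditioned on `condition`."""
    generated = []
    for _ in range(max_len):
        scores = toy_scores(condition, generated)
        # Greedy choice: take the highest-scoring token id.
        token = max(range(VOCAB_SIZE), key=scores.__getitem__)
        if token == EOS:
            break
        generated.append(token)
    return generated


print(generate([3, 10, 6]))  # [4, 4, 7]
```

In a real system of this kind, the condition would be features extracted from the degraded input speech, the scores would come from a trained network, and the generated token sequence would be passed to a codec decoder to reconstruct the waveform. Those components are outside the scope of this sketch.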

Business Value

Using a single model for multiple speech enhancement tasks could enable more versatile and potentially higher-quality audio processing in applications such as voice assistants, call center noise reduction, and audio editing.
