UniSE is a unified framework that handles multiple speech enhancement (SE) tasks with a single decoder-only autoregressive language model (LM). The model generates discrete tokens conditioned on input speech features, and the results show that an LM can handle diverse SE sub-tasks while remaining competitive with specialized baselines.
Because a single model covers multiple tasks, this approach could enable more versatile and potentially higher-quality audio processing for applications such as voice assistants, call center noise reduction, and audio editing.
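The generation step described in the summary, an autoregressive model emitting discrete tokens one at a time given a conditioning input, can be illustrated with a minimal sketch. This is not the UniSE model: `toy_logits` is a hypothetical stand-in for a trained decoder-only LM, and `VOCAB_SIZE`, `EOS`, and the integer "condition frames" are assumptions made only so the loop runs end to end.

```python
VOCAB_SIZE = 8  # hypothetical size of the discrete token codebook
EOS = 0         # hypothetical end-of-sequence token id


def toy_logits(condition, generated):
    """Hypothetical stand-in for a decoder-only LM.

    Returns one score per token id, given the conditioning sequence and
    the tokens generated so far. A real model would be a trained network.
    """
    step = len(generated)
    logits = [0.0] * VOCAB_SIZE
    if step < len(condition):
        # Toy rule: map the current condition frame to a non-EOS token id.
        logits[(condition[step] + 1) % (VOCAB_SIZE - 1) + 1] = 1.0
    else:
        # All condition frames consumed: favor end-of-sequence.
        logits[EOS] = 1.0
    return logits


def generate(condition, max_len=32):
    """Greedy autoregressive decoding conditioned on `condition`."""
    generated = []
    for _ in range(max_len):
        logits = toy_logits(condition, generated)
        # Greedy choice: pick the highest-scoring token id.
        next_token = max(range(VOCAB_SIZE), key=lambda i: logits[i])
        if next_token == EOS:
            break
        # The chosen token becomes part of the context for the next step.
        generated.append(next_token)
    return generated


if __name__ == "__main__":
    print(generate([3, 5, 1]))
```

The point of the sketch is the loop structure: each new token is chosen from scores that depend on both the conditioning input and everything generated so far. In a real system the output tokens would then be passed to an audio codec decoder to produce a waveform, a step omitted here.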