📄 Abstract
We propose Schrödinger Bridge Mamba (SBM), a new training-inference
framework motivated by the inherent compatibility between the
Schrödinger Bridge (SB) training paradigm and the selective state-space model
Mamba. We exemplify SBM with an implementation for generative
speech enhancement. Experiments on a joint denoising and dereverberation task
using four benchmark datasets demonstrate that SBM, with only 1-step inference,
outperforms strong baselines with 1-step or iterative inference and achieves
the best real-time factor (RTF). Beyond speech enhancement, we discuss the
integration of the SB paradigm and the selective state-space model architecture
based on their underlying alignment, which indicates a promising direction for
exploring new deep generative models potentially applicable to a broad range of
generative tasks. Demo page: https://sbmse.github.io
Authors (4)
Jing Yang
Sirui Wang
Chao Wu
Fan Fan
Submitted
October 19, 2025
Key Contributions
This paper introduces Schrödinger Bridge Mamba (SBM), a training-inference framework that combines the Schrödinger Bridge paradigm with the Mamba architecture, demonstrated on generative speech enhancement. With only 1-step inference, SBM outperforms strong 1-step and iterative baselines on a joint denoising and dereverberation task across four benchmark datasets and achieves the best real-time factor (RTF). The authors also discuss its potential for a broader range of generative tasks.
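The real-time factor mentioned above is conventionally the ratio of processing time to audio duration, so a value below 1 means the system runs faster than real time. A minimal Python sketch of that measurement follows. It assumes a hypothetical `identity_enhance` stand-in and a 16 kHz sample rate; neither comes from the paper, and the paper's own measurement protocol may differ.

```python
import time


def real_time_factor(elapsed_seconds, num_samples, sample_rate):
    """RTF = processing time / audio duration (conventional definition)."""
    audio_seconds = num_samples / sample_rate
    return elapsed_seconds / audio_seconds


def measure_rtf(enhance, audio, sample_rate):
    """Time a single call to `enhance` and return its RTF on `audio`."""
    start = time.perf_counter()
    enhance(audio)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, len(audio), sample_rate)


def identity_enhance(audio):
    # Hypothetical stand-in for an enhancement model; NOT the SBM network.
    return list(audio)


if __name__ == "__main__":
    noisy = [0.0] * 16000  # one second of silence at an assumed 16 kHz
    print(measure_rtf(identity_enhance, noisy, 16000))
```

For example, 0.5 seconds of processing on 10 seconds of audio gives an RTF of 0.05. A 1-step model calls the network once per utterance, whereas an iterative sampler calls it once per step, which is why step count tends to dominate RTF.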
Business Value
Enables real-time, high-quality audio processing for applications like voice calls, virtual meetings, and voice assistants, improving user experience and enabling new real-time audio manipulation capabilities.