Abstract
Speech summarization has become an essential tool for efficiently managing
and accessing the growing volume of spoken and audiovisual content. However,
despite its increasing importance, speech summarization remains loosely
defined. The field intersects with several research areas, including speech
recognition, text summarization, and specific applications like meeting
summarization. This survey not only examines existing datasets and evaluation
protocols, which are crucial for assessing the quality of summarization
approaches, but also synthesizes recent developments in the field, highlighting
the shift from traditional systems to advanced models like fine-tuned cascaded
architectures and end-to-end solutions. In doing so, we surface the ongoing
challenges, such as the need for realistic evaluation benchmarks, multilingual
datasets, and long-context handling.
Authors (7)
Fabian Retkowski
Maike Züfle
Andreas Sudmann
Dinah Pfau
Shinji Watanabe
Jan Niehues
+1 more
Key Contributions
Provides a comprehensive survey of speech summarization, examining existing datasets, evaluation protocols, and recent advancements. It highlights the shift from traditional systems to fine-tuned cascaded architectures and end-to-end models, and identifies ongoing challenges such as realistic evaluation benchmarks, multilingual datasets, and long-context handling.
Business Value
Enables better understanding and development of tools for efficiently processing and extracting information from the vast amount of spoken and audiovisual content generated daily.