
StressTransfer: Stress-Aware Speech-to-Speech Translation with Emphasis Preservation

Abstract

We propose a stress-aware speech-to-speech translation (S2ST) system that preserves word-level emphasis by leveraging LLMs for cross-lingual emphasis conversion. Our method translates source-language stress into target-language tags that guide a controllable TTS model. To overcome data scarcity, we develop a pipeline that automatically generates aligned training data, and we introduce an "LLM-as-Judge" approach for evaluation. Experiments show that our approach substantially outperforms baselines in preserving emphasis while maintaining comparable translation quality, speaker intent, and naturalness. Our work highlights the importance of prosody in translation and provides an effective, data-efficient solution for preserving paralinguistic cues in S2ST.
Authors: Xi Chen, Yuchen Song, Satoshi Nakamura
Submitted: October 15, 2025
arXiv Category: cs.CL

Key Contributions

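This paper proposes a stress-aware speech-to-speech translation (S2ST) system that preserves word-level emphasis by using LLMs to convert source-language stress into target-language emphasis tags for a controllable TTS model. To overcome data scarcity, it introduces a pipeline that automatically generates aligned training data, along with an 'LLM-as-Judge' evaluation. In the reported experiments, the approach substantially outperforms baselines at preserving emphasis while maintaining comparable translation quality, speaker intent, and naturalness.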
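
To make the idea of emphasis tags concrete, here is a minimal sketch of how a tagged translation might be split into plain text plus stressed-word positions before being handed to a controllable TTS model. The asterisk tag format and the function name `parse_emphasis` are assumptions made for illustration only; the summary does not specify the tag syntax the paper actually uses.

```python
import re
from typing import List, Tuple

# Assumed (hypothetical) tag format: an emphasized word is wrapped in
# asterisks, optionally followed by punctuation, e.g. "I *never* said that."
_TAGGED_WORD = re.compile(r"\*(.+)\*([.,!?;:]*)")


def parse_emphasis(tagged: str) -> Tuple[str, List[int]]:
    """Split a tagged sentence into plain text and stressed word indices.

    Indices are zero-based positions in the whitespace-tokenized sentence.
    """
    words: List[str] = []
    stressed: List[int] = []
    for index, token in enumerate(tagged.split()):
        match = _TAGGED_WORD.fullmatch(token)
        if match:
            # Keep the word and any trailing punctuation, drop the markers.
            stressed.append(index)
            words.append(match.group(1) + match.group(2))
        else:
            words.append(token)
    return " ".join(words), stressed


if __name__ == "__main__":
    text, positions = parse_emphasis("I *never* said that.")
    print(text)       # plain text for the TTS front end
    print(positions)  # word positions that should receive stress
```

Moving the stress from one word to another changes `positions` but leaves `text` untouched, which is the kind of distinction an emphasis-preserving system has to carry across languages.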

Business Value

Preserving emphasis makes translated speech more expressive and retains more of its emotional nuance, which can improve cross-cultural communication and user experience in applications such as virtual assistants and international calls.