This paper introduces SPIRIT, a post-hoc patching defense that protects Speech Language Models (SLMs) against jailbreak attacks. It intervenes at inference time by modifying activations, and it achieves up to 99% robustness with negligible impact on utility and no retraining. The defense addresses the vulnerability of SLMs to adversarial speech inputs.
The defense enhances the security and trustworthiness of voice-enabled AI systems, which is crucial for their widespread adoption in sensitive applications such as customer service, personal assistants, and secure communication.
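To make the idea of an inference-time activation intervention concrete, here is a minimal sketch in plain Python. It is not SPIRIT's actual procedure: the toy two-layer network, its weights, the choice of which hidden units to patch, and the use of activations from a clean input as replacement values are all assumptions made for illustration. The only point it shows is that selected activations can be overwritten during the forward pass while the weights stay untouched, so no retraining is involved.

```python
from typing import Callable, List, Optional

Vector = List[float]
Matrix = List[List[float]]


def matvec(weights: Matrix, v: Vector) -> Vector:
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def relu(v: Vector) -> Vector:
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]


def hidden_activations(x: Vector, w1: Matrix) -> Vector:
    """Hidden-layer activations of the toy network."""
    return relu(matvec(w1, x))


def forward(x: Vector, w1: Matrix, w2: Matrix,
            patch: Optional[Callable[[Vector], Vector]] = None) -> Vector:
    """Forward pass; an optional patch rewrites the hidden activations."""
    hidden = hidden_activations(x, w1)
    if patch is not None:
        hidden = patch(hidden)  # intervention happens here, weights unchanged
    return matvec(w2, hidden)


def make_patch(indices: List[int], reference: Vector) -> Callable[[Vector], Vector]:
    """Build a patch that overwrites the chosen units with reference values."""
    def patch(hidden: Vector) -> Vector:
        patched = list(hidden)  # copy so the caller's list is not mutated
        for i in indices:
            patched[i] = reference[i]
        return patched
    return patch


# Hypothetical toy weights and inputs (assumptions, not from the paper).
W1: Matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W2: Matrix = [[1.0, 1.0, 1.0]]
clean_input: Vector = [1.0, 0.0]
perturbed_input: Vector = [1.0, 2.0]  # stands in for an adversarial input

# Patch hidden units 1 and 2 with their values on the clean input.
clean_hidden = hidden_activations(clean_input, W1)
patch_fn = make_patch([1, 2], clean_hidden)

unpatched_output = forward(perturbed_input, W1, W2)
patched_output = forward(perturbed_input, W1, W2, patch=patch_fn)
```

In this toy setup the patched forward pass on the perturbed input reproduces the output obtained on the clean input, because the units that the perturbation affected are the ones being overwritten. In a real model, identifying which activations to patch and what to replace them with is the hard part, and the sketch makes no claim about how the paper does it.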