Repetitions are not all alike: distinct mechanisms sustain repetition in language models

Abstract

Large Language Models (LLMs) can sometimes degrade into repetitive loops, persistently generating identical word sequences. Because repetition is rare in natural human language, its frequent occurrence across diverse tasks and contexts in LLMs remains puzzling. Here we investigate whether behaviorally similar repetition patterns arise from distinct underlying mechanisms and how these mechanisms develop during model training. We contrast two conditions: repetitions elicited by natural text prompts and repetitions induced by in-context learning (ICL) setups that explicitly require copying behavior. Our analyses reveal that ICL-induced repetition relies on a dedicated network of attention heads that progressively specialize over training, whereas naturally occurring repetition emerges early and lacks a defined circuitry. Attention inspection further shows that natural repetition focuses disproportionately on low-information tokens, suggesting a fallback behavior when relevant context cannot be retrieved. These results indicate that superficially similar repetition behaviors originate from qualitatively different internal processes, reflecting distinct modes of failure and adaptation in language models.
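The two prompting conditions can be reproduced informally with off-the-shelf tooling. The sketch below is a minimal illustration, not the paper's experimental setup: it uses `gpt2` as a stand-in model, generates greedily from a natural-text prompt and from an ICL copy-style prompt, and flags degenerate loops by checking whether the output ends with the same short n-gram repeated verbatim. The model choice, prompts, and loop heuristic are all assumptions made for illustration.

```python
# Minimal sketch: elicit and detect repetition loops under two prompting
# conditions (natural text vs. ICL-style copying). Not the paper's setup;
# the model, prompts, and loop heuristic are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # placeholder small model, not one studied in the paper
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

def has_trailing_loop(ids, max_period=8, min_repeats=3):
    """True if the token sequence ends with one n-gram repeated min_repeats times."""
    for p in range(1, max_period + 1):
        tail = ids[-p * min_repeats:]
        if len(tail) < p * min_repeats:
            break
        chunks = [tail[i * p:(i + 1) * p] for i in range(min_repeats)]
        if all(c == chunks[0] for c in chunks):
            return True
    return False

prompts = {
    # natural-text condition: ordinary prose continuation
    "natural": "The committee met on Tuesday to discuss the budget and the",
    # ICL condition: few-shot format that explicitly rewards copying
    "icl_copy": "Input: red blue green\nOutput: red blue green\n"
                "Input: one two three\nOutput: one two three\n"
                "Input: cat dog bird\nOutput:",
}

for name, prompt in prompts.items():
    enc = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**enc, max_new_tokens=60, do_sample=False,
                             pad_token_id=tok.eos_token_id)
    new_ids = out[0, enc.input_ids.shape[1]:].tolist()
    print(name, "loop detected:", has_trailing_loop(new_ids))
```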

Key Contributions

This paper investigates the distinct underlying mechanisms behind repetition in LLMs, contrasting repetition triggered by natural text prompts with repetition induced by in-context learning (ICL) setups that require copying. It shows that ICL-induced repetition relies on a dedicated set of attention heads that progressively specialize during training, whereas natural repetition emerges early, lacks a defined circuit, and attends disproportionately to low-information tokens, suggesting a fallback behavior when relevant context cannot be retrieved. A simple way to probe the attention finding is sketched below.
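The sketch below is an assumption-laden illustration rather than the paper's analysis: it measures how much of the final position's attention mass falls on a stopword-style list of low-information tokens, using the `output_attentions` option of Hugging Face Transformers. The stopword list, the averaging over layers and heads, and the choice of `gpt2` are hypothetical choices made only to show the shape of such a probe.

```python
# Sketch: estimate the share of attention mass the last position places on
# low-information (stopword-like) context tokens. Illustrative only; the
# stopword list, layer/head averaging, and model are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

LOW_INFO = {"the", "a", "an", "of", "and", "to", "in", "is", "that", "it", ","}

def low_info_attention_share(text):
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, output_attentions=True)
    # out.attentions: one (batch, heads, seq, seq) tensor per layer
    att = torch.stack(out.attentions).mean(dim=(0, 2))[0, -1]  # avg over layers+heads, last query position
    tokens = tok.convert_ids_to_tokens(enc.input_ids[0])
    low = [i for i, t in enumerate(tokens) if t.lstrip("Ġ").lower() in LOW_INFO]
    return float(att[low].sum() / att.sum()) if low else 0.0

print(low_info_attention_share(
    "The cat sat on the mat and the cat sat on the mat and the"))
```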

Business Value

A better understanding of why LLMs fall into repetitive loops can inform more robust and reliable text generation systems, reducing degenerate, repetitive outputs in applications such as content creation, chatbots, and code generation.