The Mechanistic Emergence of Symbol Grounding in Language Models

Abstract

Symbol grounding (Harnad, 1990) describes how symbols such as words acquire their meanings by connecting to real-world sensorimotor experience. Recent work has offered preliminary evidence that grounding can emerge in (vision-)language models trained at scale without explicit grounding objectives, yet the specific loci of this emergence and the mechanisms that drive it remain largely unexplored. To address this gap, we introduce a controlled evaluation framework that systematically traces how symbol grounding arises within a model's internal computations through mechanistic and causal analysis. Our findings show that grounding concentrates in middle-layer computations and is implemented through an aggregation mechanism, in which attention heads aggregate the environmental ground to support the prediction of linguistic forms. This phenomenon replicates in multimodal dialogue and across architectures (Transformers and state-space models), but not in unidirectional LSTMs. Our results provide behavioral and mechanistic evidence that symbol grounding can emerge in language models, with practical implications for predicting and potentially controlling the reliability of generation.
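
To give a concrete flavor of the kind of causal analysis the abstract describes, below is a minimal layer-wise activation-patching sketch: clean activations are swapped into a corrupted run, one block at a time, to see which layers carry the grounding information. This is a standard causal-tracing technique, not the paper's actual framework; the GPT-2 checkpoint, the minimal-pair prompts, and the target-logit metric are all illustrative assumptions.

```python
# Illustrative layer-wise activation patching (a generic causal-tracing
# technique; not the paper's exact framework). Assumes a GPT-2 checkpoint
# from Hugging Face and a minimal pair of prompts that tokenize to the
# same length.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
tok = AutoTokenizer.from_pretrained("gpt2")

# Hypothetical minimal pair: identical except for the environmental ground.
clean = "The ball is on the table. The ball is on the"
corrupt = "The ball is on the shelf. The ball is on the"
clean_ids = tok(clean, return_tensors="pt").input_ids
corrupt_ids = tok(corrupt, return_tensors="pt").input_ids
assert clean_ids.shape == corrupt_ids.shape  # patching needs aligned positions
target = tok(" table", return_tensors="pt").input_ids[0, 0]

# 1) Cache the clean run's hidden states at every transformer block.
cache = {}
def save_hook(i):
    def hook(module, args, output):
        cache[i] = output[0].detach()  # block output is a tuple; [0] is hidden states
    return hook

handles = [blk.register_forward_hook(save_hook(i))
           for i, blk in enumerate(model.transformer.h)]
with torch.no_grad():
    model(clean_ids)
for h in handles:
    h.remove()

# 2) Re-run the corrupted prompt, patching one block at a time, and see
#    how much of the clean prediction each layer restores.
def patch_hook(i):
    def hook(module, args, output):
        return (cache[i],) + output[1:]  # replace hidden states with clean ones
    return hook

for i, blk in enumerate(model.transformer.h):
    handle = blk.register_forward_hook(patch_hook(i))
    with torch.no_grad():
        logits = model(corrupt_ids).logits
    handle.remove()
    print(f"layer {i:2d}: target logit {logits[0, -1, target].item():+.3f}")
```

If grounding concentrates in middle layers, as the abstract reports, the patched target logit should recover most strongly when the patched block lies in the middle of the stack.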

Key Contributions

Investigates the mechanistic emergence of symbol grounding in large-scale models using a controlled evaluation framework and causal analysis. The work shows that grounding concentrates in middle-layer computations and is implemented via an 'aggregate' mechanism in which attention heads link environmental input to linguistic output, a pattern that replicates across Transformers and state-space models but not in unidirectional LSTMs. A sketch of one way to probe this mechanism follows.
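
To make the aggregation claim concrete, here is a hedged sketch of one possible per-head score: the attention mass flowing from the final (prediction) position back to the tokens describing the environment. The prompt and the span of ground-token positions are assumptions for illustration; this is not the paper's published probe.

```python
# Illustrative per-head "aggregation" score: attention mass from the final
# (prediction) position onto the environmental-ground tokens. The prompt
# and the ground span are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
tok = AutoTokenizer.from_pretrained("gpt2")

prompt = "The ball is on the table. The ball is on the"
ids = tok(prompt, return_tensors="pt").input_ids
ground = list(range(7))  # positions of "The ball is on the table." (assumed)

with torch.no_grad():
    out = model(ids, output_attentions=True)

for layer, attn in enumerate(out.attentions):  # each: (1, heads, seq, seq)
    mass = attn[0, :, -1, ground].sum(dim=-1)  # per-head mass onto the ground
    head = mass.argmax().item()
    print(f"layer {layer:2d}: head {head} puts {mass[head]:.2f} of its "
          f"final-position attention on the ground span")
```

Under the paper's account, heads with high scores in the middle layers would be the candidates implementing the aggregate mechanism; causal interventions (such as the patching sketch above) would then be needed to confirm they actually drive the prediction.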

Business Value

Deepens our fundamental understanding of how AI models acquire meaning, which is crucial for building more robust, interpretable, and trustworthy AI systems, especially in multimodal applications.