Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: We present a novel framework for training large language models with
continuously adjustable internal representations that span the full spectrum
from localist (interpretable, rule-based) to distributed (generalizable,
efficient) encodings. The key innovations are (1) a locality dial, a tunable
parameter that dynamically controls the degree of localization during both
training and inference without requiring model retraining, (2) an
information-theoretic recruitment mechanism that adaptively allocates semantic
blocks as needed, eliminating the requirement for complete domain knowledge at
initialization, and (3) a hierarchical recruitment framework that extends
capacity allocation to entire specialized LLMs, enabling multi-granularity
architectural adaptation. This is achieved through group sparsity penalties on
attention mechanisms, information-theoretic anchor design, dynamic rule
injection, and principled recruitment criteria based on penalized likelihood
with explicit units. We provide rigorous mathematical results establishing
explicit threshold conditions under which attention provably concentrates on
semantically relevant blocks at stationary points, with exact bounds on
attention entropy and pointer fidelity. The hierarchical recruitment mechanism
provides convergence guarantees at both the block level (fine-grained,
within-LLM) and the LLM level (coarse-grained, cross-domain), ensuring the
system discovers semantic partitions that balance model complexity against data
encoding efficiency. This framework enables practitioners to continuously
interpolate between interpretable and high-performance modes while adapting
architectural capacity at multiple granularities, supporting applications in
regulated domains requiring both transparency and capability.
Submitted
October 20, 2025
Key Contributions
Introduces a framework for training LLMs with continuously adjustable internal representations from localist to distributed using a 'locality dial' and an 'information-theoretic recruitment mechanism'. This allows dynamic control over model behavior and capacity allocation without retraining.
Business Value
Enables the development of more transparent and adaptable AI systems, which can be crucial for regulated industries or applications requiring explainability, while maintaining high performance.