📄 Abstract
Large Language Models (LLMs) have demonstrated impressive reasoning
capabilities but continue to struggle with arithmetic tasks. Prior work
largely focuses on outputs or prompting strategies, leaving open the question of
the internal structure through which models perform arithmetic computation. In this
work, we investigate whether LLMs encode operator precedence in their internal
representations, using the open-source instruction-tuned LLaMA 3.2-3B model. We
construct a dataset of arithmetic expressions with three operands and two
operators, varying operator order and parenthesis placement. Using this dataset,
we trace whether intermediate results appear in the model's residual stream,
applying interpretability techniques such as the logit lens, linear
classification probes, and UMAP geometric visualization.
Our results show that intermediate computations are present in the residual
stream, particularly after MLP blocks. We also find that the model linearly
encodes precedence in each operator's embedding after the attention layer. Finally, we
introduce partial embedding swap, a technique that modifies operator precedence
by exchanging high-impact embedding dimensions between operators.
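The abstract does not specify how "high-impact" dimensions are ranked, so the sketch below is a minimal illustration on toy vectors: it approximates impact by the absolute difference between the two operator embeddings (a hypothetical proxy, not necessarily the paper's criterion) and swaps the top-k dimensions between them.

```python
import numpy as np

def partial_embedding_swap(emb_a, emb_b, k):
    """Swap the k highest-impact dimensions between two operator embeddings.

    'Impact' is approximated here by the absolute per-dimension difference
    between the two embeddings -- a stand-in for whatever ranking the
    paper actually uses.
    """
    impact = np.abs(emb_a - emb_b)
    top = np.argsort(impact)[-k:]            # indices of the k most-different dims
    new_a, new_b = emb_a.copy(), emb_b.copy()
    new_a[top], new_b[top] = emb_b[top], emb_a[top]
    return new_a, new_b

# Toy 8-dimensional "operator" embeddings standing in for '+' and '*'
plus = np.array([0.1, 0.9, -0.3, 0.0, 0.5, -0.8, 0.2, 0.4])
times = np.array([0.1, -0.7, -0.3, 0.6, 0.5, 0.8, -0.9, 0.4])
new_plus, new_times = partial_embedding_swap(plus, times, k=3)
```

In the paper's setting the swapped vectors would replace the original operator embeddings at inference time, so that the model treats, say, `+` with the precedence it had learned for `*`.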
Authors (7)
Dharunish Yugeswardeenoo
Harshil Nukala
Ved Shah
Cole Blondin
Sean O Brien
Vasu Sharma
+1 more
Submitted
October 14, 2025
Key Contributions
Investigates the internal representation of operator precedence in LLMs using LLaMA 3.2-3B. By tracing intermediate computations in the residual stream with interpretability techniques (logit lens, linear probes, UMAP), it shows that the model encodes intermediate arithmetic results, particularly after MLP blocks, shedding light on its reasoning mechanisms.
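The linear-probe step can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's setup: the real probes are trained on LLaMA residual-stream activations, while here two toy "activation" classes (standing in for which operator is evaluated first) are separated along a random direction, and a least-squares linear probe tests whether the label is linearly decodable.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy activation dimension (real residual streams are much wider)

# Synthetic "activations": class 1 is shifted along a fixed direction,
# mimicking a linearly encoded precedence signal.
direction = rng.normal(size=d)
X0 = rng.normal(size=(100, d))                    # e.g. "left operator first"
X1 = rng.normal(size=(100, d)) + 2.0 * direction  # e.g. "right operator first"
X = np.vstack([X0, X1])
y = np.concatenate([np.zeros(100), np.ones(100)])

# Fit a linear probe by least squares on +/-1 targets; logistic
# regression is the more common choice but behaves similarly here.
Xb = np.hstack([X, np.ones((200, 1))])            # append a bias column
w, *_ = np.linalg.lstsq(Xb, 2 * y - 1, rcond=None)
acc = float(np.mean((Xb @ w > 0) == y.astype(bool)))
```

High probe accuracy on held-out activations is the evidence pattern the paper relies on: if a simple linear readout recovers precedence, the information is linearly encoded at that layer.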
Business Value
Enhances trust and reliability in LLMs by providing insights into their reasoning processes, crucial for high-stakes applications like finance or scientific modeling.