Abstract
Large language models achieve impressive results, but distinguishing factual
reasoning from hallucinations remains challenging. We propose a spectral
analysis framework that models transformer layers as dynamic graphs induced by
attention, with token embeddings as signals on these graphs. Through graph
signal processing, we define diagnostics including Dirichlet energy, spectral
entropy, and high-frequency energy ratios, with theoretical connections to
computational stability. Experiments across GPT architectures suggest universal
spectral patterns: factual statements exhibit consistent "energy mountain"
behavior with low-frequency convergence, while different hallucination types
show distinct signatures. Logical contradictions destabilize spectra with large
effect sizes ($g>1.0$), semantic errors remain stable but show connectivity
drift, and substitution hallucinations display intermediate perturbations. A
simple detector using spectral signatures achieves 88.75% accuracy versus 75%
for perplexity-based baselines, demonstrating practical utility. These findings
indicate that spectral geometry may capture reasoning patterns and error
behaviors, potentially offering a framework for hallucination detection in
large language models.
Submitted: October 21, 2025
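The abstract names its diagnostics without defining them. The sketch below is a minimal illustration, not the authors' implementation: it shows one plausible way to compute Dirichlet energy, spectral entropy, and a high-frequency energy ratio for a single transformer layer. The head-averaged attention input, the symmetrization step, and the combinatorial Laplacian are assumptions, and every name (`spectral_diagnostics`, `attn`, `X`) is hypothetical.

```python
# Hedged sketch of the spectral diagnostics named in the abstract, for one layer.
# Assumptions: `attn` is a head-averaged attention matrix of shape (T, T) and
# `X` holds that layer's token embeddings, shape (T, d).
import numpy as np

def spectral_diagnostics(attn: np.ndarray, X: np.ndarray) -> dict:
    # Symmetrize attention to get an undirected weighted graph over the tokens.
    W = 0.5 * (attn + attn.T)
    np.fill_diagonal(W, 0.0)

    # Combinatorial graph Laplacian L = D - W.
    D = np.diag(W.sum(axis=1))
    L = D - W

    # Dirichlet energy of the embedding signal: trace(X^T L X),
    # i.e. how strongly neighboring tokens' embeddings disagree.
    dirichlet = float(np.trace(X.T @ L @ X))

    # Graph Fourier transform: project X onto the Laplacian eigenbasis.
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues ascending (low -> high frequency)
    X_hat = eigvecs.T @ X                  # spectral coefficients, shape (T, d)
    energy = (X_hat ** 2).sum(axis=1)      # signal energy per graph frequency

    # Spectral entropy of the normalized energy distribution.
    p = energy / (energy.sum() + 1e-12)
    spectral_entropy = float(-(p * np.log(p + 1e-12)).sum())

    # High-frequency energy ratio: share of energy above the median frequency.
    hi = eigvals > np.median(eigvals)
    hf_ratio = float(energy[hi].sum() / (energy.sum() + 1e-12))

    return {"dirichlet_energy": dirichlet,
            "spectral_entropy": spectral_entropy,
            "hf_energy_ratio": hf_ratio}

# Toy usage with random data standing in for one layer's attention and embeddings.
rng = np.random.default_rng(0)
T, d = 16, 32
attn = rng.random((T, T))
attn /= attn.sum(axis=1, keepdims=True)   # row-stochastic, like softmax attention
X = rng.normal(size=(T, d))
print(spectral_diagnostics(attn, X))
```

Tracking these three numbers layer by layer is what would reveal the "energy mountain" profile the abstract describes for factual statements.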
Key Contributions
Proposes a spectral analysis framework using graph signal processing on transformer attention graphs to detect LLM hallucinations. Identifies distinct spectral signatures (e.g., "energy mountain" behavior) for factual statements versus different types of hallucinations, enabling a simple detector that reaches 88.75% accuracy against a 75% perplexity-based baseline.
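The paper does not specify the detector's form here; the sketch below assumes each statement has been reduced to a fixed-length vector of per-layer spectral features (for example, the three diagnostics from the earlier sketch stacked across layers) and feeds them to a logistic-regression classifier. The classifier choice, feature layout, and random placeholder data are assumptions, not the paper's setup, so the printed accuracy is meaningless on this synthetic input.

```python
# Hedged sketch of a "simple detector" over spectral features (assumed setup,
# not the paper's exact classifier or data).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_statements, n_layers = 200, 12
# Placeholder features: 3 spectral diagnostics per layer per statement.
features = rng.normal(size=(n_statements, n_layers * 3))
labels = rng.integers(0, 2, size=n_statements)   # 1 = hallucination, 0 = factual

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```

A linear model keeps the detector as simple as the summary implies; an even simpler alternative would be thresholding a single diagnostic such as the high-frequency energy ratio.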
Business Value
Enhances the trustworthiness of LLMs by providing a method to detect hallucinations, crucial for applications requiring factual accuracy, such as customer service, content generation, and information retrieval.