
The Reasoning Trap: How Enhancing LLM Reasoning Amplifies Tool Hallucination

Abstract

Enhancing the reasoning capabilities of Large Language Models (LLMs) is a key strategy for building agents that "think then act." However, recent observations, such as those reported for OpenAI's o3, suggest a paradox: stronger reasoning often coincides with increased hallucination, yet no prior work has systematically examined whether reasoning enhancement itself causes tool hallucination. To address this gap, we pose the central question: does strengthening reasoning increase tool hallucination? To answer it, we introduce SimpleToolHalluBench, a diagnostic benchmark that measures tool hallucination in two failure modes: (i) no tool available, and (ii) only distractor tools available. Through controlled experiments, we establish three key findings. First, we demonstrate a causal relationship: progressively enhancing reasoning through RL increases tool hallucination in proportion to task-performance gains. Second, this effect transcends overfitting: training on non-tool tasks (e.g., mathematics) still amplifies subsequent tool hallucination. Third, the effect is method-agnostic, appearing both when reasoning is instilled via supervised fine-tuning and when it is merely elicited at inference by switching from direct answers to step-by-step thinking. We also evaluate mitigation strategies, including prompt engineering and Direct Preference Optimization (DPO), revealing a fundamental reliability-capability trade-off: reducing hallucination consistently degrades utility. Mechanistically, reasoning RL disproportionately collapses tool-reliability-related representations, and hallucinations surface as amplified divergences concentrated in late-layer residual streams. These findings show that current reasoning-enhancement methods inherently amplify tool hallucination, highlighting the need for new training objectives that jointly optimize for capability and reliability.
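
As a rough illustration of the two failure modes described in the abstract, the Python sketch below builds a prompt for each setting and flags outputs that fabricate a tool call. The tool-call syntax, distractor tool list, and helper names are assumptions made for illustration only; they are not taken from the benchmark itself.

```python
import json
import re

# Hypothetical distractor tools, irrelevant to the question being asked.
DISTRACTOR_TOOLS = [
    {"name": "get_weather", "description": "Return the current weather for a city."},
    {"name": "translate_text", "description": "Translate text between two languages."},
]

def build_prompt(question: str, mode: str) -> str:
    """Build an evaluation prompt for one of the two failure modes:
    'no_tool' (no tool available) or 'distractor' (only irrelevant tools)."""
    if mode == "no_tool":
        tool_block = "Available tools: none"
    elif mode == "distractor":
        tool_block = "Available tools:\n" + json.dumps(DISTRACTOR_TOOLS, indent=2)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return (
        f"{tool_block}\n\n"
        f"User question: {question}\n"
        "If no listed tool is relevant, answer directly without calling any tool."
    )

def is_tool_hallucination(model_output: str) -> bool:
    """Flag a response that emits a tool call even though none is appropriate.
    Assumes (for illustration) that tool calls look like: CALL tool_name({...})."""
    return re.search(r"\bCALL\s+\w+\s*\(", model_output) is not None
```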
Authors: Chenlong Yin, Zeyang Sha, Shiwen Cui, Changhua Meng
Submitted: October 27, 2025
arXiv Category: cs.LG

Key Contributions

Establishes a causal relationship: enhancing LLM reasoning capabilities proportionally increases tool hallucination. Introduces SimpleToolHalluBench, a diagnostic benchmark to systematically measure this phenomenon.
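
Conceptually, the benchmark's measurement reduces to a rate over such prompts. The sketch below reuses the hypothetical build_prompt and is_tool_hallucination helpers from the snippet above and assumes a generate callable that wraps the model under evaluation; none of these names come from the paper.

```python
from typing import Callable, Iterable

def hallucination_rate(
    questions: Iterable[str],
    generate: Callable[[str], str],  # wraps the model under evaluation (assumed)
    mode: str = "no_tool",           # "no_tool" or "distractor"
) -> float:
    """Fraction of prompts on which the model fabricates a tool call.
    Reuses build_prompt / is_tool_hallucination from the sketch above."""
    questions = list(questions)
    if not questions:
        return 0.0
    flagged = sum(
        is_tool_hallucination(generate(build_prompt(q, mode)))
        for q in questions
    )
    return flagged / len(questions)
```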

Business Value

Understanding this reliability-capability trade-off is crucial for building trustworthy AI agents that use external tools reliably, without introducing errors or unintended consequences into automation and decision-making systems.