Abstract
Enhancing the reasoning capabilities of Large Language Models (LLMs) is a key
strategy for building agents that "think then act." However, recent
observations of models such as OpenAI's o3 suggest a paradox: stronger reasoning often
coincides with increased hallucination, yet no prior work has systematically
examined whether reasoning enhancement itself causes tool hallucination. To
address this gap, we pose the central question: Does strengthening reasoning
increase tool hallucination? To answer this, we introduce SimpleToolHalluBench,
a diagnostic benchmark measuring tool hallucination in two failure modes: (i)
no tool available, and (ii) only distractor tools available. Through controlled
experiments, we establish three key findings. First, we demonstrate a causal
relationship: progressively enhancing reasoning through RL increases tool
hallucination proportionally with task performance gains. Second, this effect
transcends overfitting: training on non-tool tasks (e.g., mathematics) still
amplifies subsequent tool hallucination. Third, the effect is method-agnostic,
appearing when reasoning is instilled via supervised fine-tuning and when it is
merely elicited at inference by switching from direct answers to step-by-step
thinking. We also evaluate mitigation strategies, including prompt engineering
and Direct Preference Optimization (DPO), revealing a fundamental
reliability-capability trade-off: reducing hallucination consistently degrades
utility. Mechanistically, Reasoning RL disproportionately collapses
tool-reliability-related representations, and hallucinations surface as
amplified divergences concentrated in late-layer residual streams. These
findings reveal that current reasoning enhancement methods inherently amplify
tool hallucination, highlighting the need for new training objectives that
jointly optimize for capability and reliability.
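To make the two failure modes concrete, below is a minimal sketch of how such diagnostic cases could be posed to a tool-calling model. The request wording, tool names, and schema fields are illustrative assumptions, not the benchmark's released data format.

```python
# Illustrative sketch of the two SimpleToolHalluBench failure modes described
# in the abstract. The task wording and tool schemas are assumptions for
# exposition only.

# Failure mode (i): no tool available. A faithful model should answer (or
# decline) in plain text; emitting any tool call counts as a hallucination.
no_tool_case = {
    "user_request": "Book me a table for two at an Italian restaurant tonight.",
    "available_tools": [],  # nothing to call
    "hallucination_if": "response contains a tool call",
}

# Failure mode (ii): only distractor tools available. The listed tools cannot
# satisfy the request; invoking any of them counts as a hallucination.
distractor_case = {
    "user_request": "Book me a table for two at an Italian restaurant tonight.",
    "available_tools": [
        {"name": "get_weather", "parameters": {"city": "string"}},
        {"name": "convert_currency", "parameters": {"amount": "number", "to": "string"}},
    ],
    "hallucination_if": "response calls get_weather or convert_currency",
}
```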
Authors (4)
Chenlong Yin
Zeyang Sha
Shiwen Cui
Changhua Meng
Submitted
October 27, 2025
Key Contributions
Establishes a causal relationship: enhancing LLM reasoning capabilities proportionally increases tool hallucination. Introduces SimpleToolHalluBench, a diagnostic benchmark to systematically measure this phenomenon.
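A hypothetical scoring sketch for this kind of diagnostic: a response counts as a tool hallucination if it calls a tool that cannot satisfy the request (in both benchmark modes, that set is empty). The response structure and helper names below are assumptions for illustration, not the authors' evaluation code.

```python
def is_hallucinated(response: dict, satisfiable_tools: set[str]) -> bool:
    """A response hallucinates if it calls a tool that cannot satisfy the request."""
    call = response.get("tool_call")  # None when the model answered in plain text
    return call is not None and call["name"] not in satisfiable_tools


def hallucination_rate(responses: list[dict], satisfiable_tools: set[str]) -> float:
    """Fraction of responses that fabricate a tool call."""
    if not responses:
        return 0.0
    return sum(is_hallucinated(r, satisfiable_tools) for r in responses) / len(responses)


# Example: one faithful refusal and one fabricated call to a nonexistent tool.
responses = [
    {"text": "No booking tool is available, so I cannot reserve a table."},
    {"text": "", "tool_call": {"name": "book_restaurant", "arguments": {"party_size": 2}}},
]
print(hallucination_rate(responses, satisfiable_tools=set()))  # prints 0.5
```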
Business Value
Crucial for building trustworthy AI agents that can reliably use external tools without introducing errors or unintended consequences, with direct implications for automation and decision-making systems.