📄 Abstract
The emergence of agentic reinforcement learning (Agentic RL) marks a paradigm
shift from conventional reinforcement learning applied to large language models
(LLM RL), reframing LLMs from passive sequence generators into autonomous,
decision-making agents embedded in complex, dynamic worlds. This survey
formalizes this conceptual shift by contrasting the degenerate single-step
Markov decision processes (MDPs) of LLM RL with the temporally extended,
partially observable Markov decision processes (POMDPs) that define Agentic RL.
Building on this foundation, we propose a comprehensive twofold taxonomy: one
organized around core agentic capabilities, including planning, tool use,
memory, reasoning, self-improvement, and perception, and the other around their
applications across diverse task domains. Central to our thesis is that
reinforcement learning serves as the critical mechanism for transforming these
capabilities from static, heuristic modules into adaptive, robust agentic
behavior. To support and accelerate future research, we consolidate the
landscape of open-source environments, benchmarks, and frameworks into a
practical compendium. By synthesizing over five hundred recent works, this
survey charts the contours of this rapidly evolving field and highlights the
opportunities and challenges that will shape the development of scalable,
general-purpose AI agents.
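
To make the contrast between the single-step MDP and the temporally extended POMDP concrete, the following is a minimal sketch under stated assumptions, not the survey's own formalization. All names here (`single_step_episode`, `HiddenNumberEnv`, `multi_step_episode`) and the number-guessing task are hypothetical and chosen only for illustration. The first function treats a whole response as one action with one terminal reward; the second runs a multi-step loop in which the agent never sees the true state and must act on its observation history.

```python
from typing import Callable, Tuple


def single_step_episode(policy: Callable[[str], str],
                        reward_fn: Callable[[str, str], float],
                        prompt: str) -> float:
    """Degenerate MDP with horizon 1: one state (the prompt), one action
    (the full response), one terminal reward."""
    response = policy(prompt)
    return reward_fn(prompt, response)


class HiddenNumberEnv:
    """Hypothetical toy environment with partial observability: the state
    (a secret number) is never revealed; the agent only receives the
    observations 'higher', 'lower', or 'correct'."""

    def __init__(self, secret: int, low: int = 0, high: int = 15,
                 max_steps: int = 8):
        self.secret = secret
        self.low = low
        self.high = high
        self.max_steps = max_steps
        self.steps = 0

    def step(self, guess: int) -> Tuple[str, float, bool]:
        """Apply one action and return (observation, reward, done)."""
        self.steps += 1
        if guess == self.secret:
            return "correct", 1.0, True
        obs = "higher" if guess < self.secret else "lower"
        return obs, 0.0, self.steps >= self.max_steps


def multi_step_episode(env: HiddenNumberEnv, gamma: float = 0.9) -> float:
    """Temporally extended rollout: the agent summarizes its observation
    history as a search interval, acts on it, and accumulates a
    discounted return over several steps."""
    low, high = env.low, env.high
    ret, discount, done = 0.0, 1.0, False
    while not done:
        guess = (low + high) // 2          # action chosen from the belief
        obs, reward, done = env.step(guess)
        ret += discount * reward
        discount *= gamma
        if obs == "higher":                # update the belief from the observation
            low = guess + 1
        elif obs == "lower":
            high = guess - 1
    return ret
```

In this sketch the single-step case has nothing to plan over, while the multi-step case only earns reward if earlier actions gather information that later actions exploit. The fixed binary-search agent stands in for a learned policy; an actual Agentic RL setup would optimize that policy from the returns.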
Authors (26)
Guibin Zhang
Hejia Geng
Xiaohang Yu
Zhenfei Yin
Zaibin Zhang
Zelin Tan
+20 more
Submitted
September 2, 2025
Key Contributions
This survey formalizes the paradigm shift from conventional LLM RL to Agentic RL, reframing LLMs as autonomous decision-making agents in complex worlds. It contrasts single-step MDPs with temporally extended POMDPs and proposes a comprehensive taxonomy of core agentic capabilities (planning, tool use, memory, reasoning, self-improvement, perception) and their applications, highlighting RL as the critical mechanism for adaptive agentic behavior.
Business Value
Guides the development of more sophisticated AI agents that can tackle complex real-world problems, leading to advancements in automation, robotics, and intelligent systems.