
Reasoning Curriculum: Bootstrapping Broad LLM Reasoning from Math

📄 Abstract

Reinforcement learning (RL) can elicit strong reasoning in large language models (LLMs), yet most open efforts focus on math and code. We propose Reasoning Curriculum, a simple two-stage curriculum that first elicits reasoning skills in pretraining-aligned domains such as math, then adapts and refines these skills across other domains via joint RL. Stage 1 performs a brief cold start and then math-only RL with verifiable rewards to develop reasoning skills. Stage 2 runs joint RL on mixed-domain data to transfer and consolidate these skills. The curriculum is minimal and backbone-agnostic, requiring no specialized reward models beyond standard verifiability checks. Evaluated on Qwen3-4B and Llama-3.1-8B over a multi-domain suite, Reasoning Curriculum yields consistent gains. Ablations and a cognitive-skill analysis indicate that both stages are necessary and that math-first elicitation increases cognitive behaviors important for solving complex problems. Reasoning Curriculum provides a compact, easy-to-adopt recipe for general reasoning.
Authors (5): Bo Pang, Deqian Kong, Silvio Savarese, Caiming Xiong, Yingbo Zhou
Submitted: October 30, 2025
arXiv Category: cs.AI

Key Contributions

This paper proposes Reasoning Curriculum, a two-stage RL curriculum. Stage 1 elicits reasoning skills through a brief cold start followed by math-only RL with verifiable rewards. Stage 2 transfers and refines those skills across other domains via joint RL on mixed-domain data. The approach is minimal and backbone-agnostic, yields consistent gains on Qwen3-4B and Llama-3.1-8B over a multi-domain suite, and increases cognitive behaviors important for solving complex problems.
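
The two-stage schedule described above can be sketched as a small phase selector. This is only an illustrative sketch, not the authors' implementation: the step counts, the non-math domain names, the choice to treat the cold start as math-only, and the exact-match reward are all assumptions made for the example, and the names `CurriculumConfig`, `phase_at`, `domains_at`, and `verifiable_reward` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CurriculumConfig:
    # All values below are illustrative assumptions, not numbers from the paper.
    cold_start_steps: int = 100
    math_rl_steps: int = 1000
    joint_rl_steps: int = 1000
    # Hypothetical mixed-domain pool for Stage 2.
    joint_domains: tuple = ("math", "code", "science", "logic")


def phase_at(step: int, cfg: CurriculumConfig) -> str:
    """Return the curriculum phase that a given training step falls into."""
    if step < 0:
        raise ValueError("step must be non-negative")
    stage1_end = cfg.cold_start_steps + cfg.math_rl_steps
    if step < cfg.cold_start_steps:
        return "stage1_cold_start"
    if step < stage1_end:
        return "stage1_math_rl"
    if step < stage1_end + cfg.joint_rl_steps:
        return "stage2_joint_rl"
    return "done"


def domains_at(step: int, cfg: CurriculumConfig) -> tuple:
    """Return the domains to sample training prompts from at a given step."""
    phase = phase_at(step, cfg)
    if phase == "done":
        return ()
    if phase.startswith("stage1"):
        # Assumption: the cold start is treated as math-only, like the RL that follows it.
        return ("math",)
    return cfg.joint_domains


def verifiable_reward(prediction: str, reference: str) -> float:
    """Binary reward: 1.0 if the answers match after whitespace and case normalization.

    A stand-in for a verifiability check; real checkers are usually domain-specific.
    """
    def normalize(text: str) -> str:
        return " ".join(text.split()).lower()

    return 1.0 if normalize(prediction) == normalize(reference) else 0.0
```

A training loop would call `domains_at(step, cfg)` to pick the data source for each batch and score each sampled answer with a checker such as `verifiable_reward`, so no learned reward model is involved in either stage.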

Business Value

Offers a compact, easy-to-adopt training recipe for LLMs that reason across domains beyond math and code, without requiring specialized reward models beyond standard verifiability checks.