Abstract
Reinforcement learning (RL) is increasingly applied to real-world problems
involving complex and structured decisions, such as routing, scheduling, and
assortment planning. These settings challenge standard RL algorithms, which
struggle to scale, generalize, and exploit structure in the presence of
combinatorial action spaces. We propose Structured Reinforcement Learning
(SRL), a novel actor-critic paradigm that embeds combinatorial
optimization layers into the actor neural network. We enable end-to-end
learning of the actor via Fenchel-Young losses and provide a geometric
interpretation of SRL as a primal-dual algorithm in the dual of the moment
polytope. Across six environments with exogenous and endogenous uncertainty,
SRL matches or surpasses the performance of unstructured RL and imitation
learning on static tasks and improves over these baselines by up to 92% on
dynamic problems, with improved stability and convergence speed.
Authors (5)
Heiko Hoppe
Léo Baty
Louis Bouvier
Axel Parmentier
Maximilian Schiffer
Key Contributions
This paper introduces Structured Reinforcement Learning (SRL), a novel actor-critic paradigm that embeds combinatorial optimization layers into the actor network and trains the actor end-to-end via Fenchel-Young losses. Across six environments, SRL matches or surpasses unstructured RL and imitation learning on static tasks and improves over these baselines by up to 92% on dynamic tasks, with improved stability and convergence speed.
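To make the Fenchel-Young idea concrete, here is a minimal sketch of a perturbed Fenchel-Young loss for a score vector that feeds a combinatorial oracle. It is not the authors' implementation: the top-k oracle is a toy stand-in for a combinatorial optimization layer, the target solution is assumed to be given, and all names (top_k_oracle, perturbed_fy_loss_and_grad, eps, n_samples) are hypothetical. The gradient is the average perturbed solution minus the target, so no differentiation through the oracle is needed.

```python
import numpy as np

def top_k_oracle(theta, k):
    """Toy combinatorial oracle (assumption): select the k highest-scoring items."""
    y = np.zeros_like(theta, dtype=float)
    y[np.argsort(theta)[-k:]] = 1.0
    return y

def perturbed_fy_loss_and_grad(theta, y_target, k, eps=1.0, n_samples=200, rng=None):
    """Monte Carlo estimate of a perturbed Fenchel-Young loss and its gradient.

    loss(theta) = E[max_y <theta + eps*Z, y>] - <theta, y_target>
    (up to a term that does not depend on theta), with Z standard Gaussian.
    grad(theta) = E[argmax_y <theta + eps*Z, y>] - y_target
    """
    rng = np.random.default_rng(0) if rng is None else rng
    noise = rng.standard_normal((n_samples, theta.size))
    perturbed = theta + eps * noise
    # Solve the combinatorial problem once per perturbed score vector.
    solutions = np.stack([top_k_oracle(p, k) for p in perturbed])
    f_eps = np.mean(np.sum(perturbed * solutions, axis=1))
    loss = f_eps - theta @ y_target
    grad = solutions.mean(axis=0) - y_target
    return loss, grad

# Usage: plain gradient descent on the scores toward a fixed target solution.
theta = np.zeros(5)
y_target = np.array([1.0, 0.0, 1.0, 0.0, 0.0])
for _ in range(50):
    _, grad = perturbed_fy_loss_and_grad(theta, y_target, k=2)
    theta -= 0.5 * grad
```

In an actor network, theta would be the output of the neural layers and the gradient above would be backpropagated into their weights. How the target solution is chosen is left open in this sketch.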
Business Value
SRL could enable more efficient and effective decision-making in complex operational settings, with potential cost savings, improved resource utilization, and optimized logistics in industries like manufacturing, transportation, and e-commerce.