📄 Abstract
Constrained optimization provides a common framework for dealing with
conflicting objectives in reinforcement learning (RL). In most of these
settings, the objectives (and constraints) are expressed through the expected
accumulated reward. However, this formulation neglects risky or even possibly
catastrophic events at the tails of the reward distribution, and is often
insufficient for high-stakes applications in which the risk involved in
outliers is critical. In this work, we propose a framework for risk-aware
constrained RL, which exhibits per-stage robustness properties jointly in
reward values and time using optimized certainty equivalents (OCEs). Our
framework ensures an exact equivalent to the original constrained problem
within a parameterized strong Lagrangian duality framework under appropriate
constraint qualifications, and yields a simple algorithmic recipe which can be
wrapped around standard RL solvers, such as PPO. Lastly, we establish the
convergence of the proposed algorithm under common assumptions, and verify the
risk-aware properties of our approach through several numerical experiments.
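For reference, the optimized certainty equivalent (OCE) of Ben-Tal and Teboulle is the standard risk functional referenced above. For a reward-valued random variable X and a concave, nondecreasing utility u with u(0) = 0 and 1 in the subdifferential of u at 0, it is defined as

    OCE_u(X) = \sup_{\eta \in \mathbb{R}} \{ \eta + \mathbb{E}[\, u(X - \eta) \,] \}.

Taking u(t) = -(1/\alpha)\max(-t, 0) recovers the lower-tail conditional value-at-risk CVaR_\alpha of the reward, a common risk-aware special case; the specific OCEs and their per-stage application are given in the full paper.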
Authors (5)
Jane H. Lee
Baturay Saglam
Spyridon Pougkakiotis
Amin Karbasi
Dionysis Kalogerias
Submitted
October 23, 2025
Key Contributions
Proposes a risk-aware constrained RL framework using optimized certainty equivalents (OCEs) that exhibits per-stage robustness jointly in reward values and time. Under appropriate constraint qualifications, the resulting Lagrangian formulation is exactly equivalent to the original constrained problem (strong duality), and the framework can be wrapped around standard RL solvers such as PPO.
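As a rough illustration of how such a recipe can wrap a standard solver, the sketch below runs dual ascent on a Lagrange multiplier around a PPO-style policy update, with the constraint evaluated through an OCE (here a CVaR approximated on a grid). This is not the paper's algorithm; collect_rollout, ppo_update, and the threshold convention used here are hypothetical placeholders.

import numpy as np

def oce_cvar(samples, alpha=0.1):
    # Lower-tail CVaR of a batch of returns via its OCE form,
    # sup_eta { eta - (1/alpha) * E[(eta - X)_+] }, approximated on a grid.
    etas = np.linspace(samples.min(), samples.max(), 200)
    return max(eta - np.mean(np.maximum(eta - samples, 0.0)) / alpha for eta in etas)

def primal_dual_rl(collect_rollout, ppo_update, threshold,
                   n_iters=100, dual_lr=0.05, alpha=0.1):
    # Dual ascent on the multiplier of a risk-aware constraint
    # OCE(constraint return) >= threshold, wrapped around a PPO-style update.
    lam = 0.0
    for _ in range(n_iters):
        obj_returns, con_returns = collect_rollout()   # objective / constraint returns (stub)
        ppo_update(lagrangian_weight=lam)              # primal step on the Lagrangian (stub)
        violation = threshold - oce_cvar(np.asarray(con_returns), alpha)
        lam = max(0.0, lam + dual_lr * violation)      # projected dual step
    return lam

In practice, collect_rollout and ppo_update would come from an existing PPO implementation, with lam weighting the constraint reward inside the policy update; the exact per-stage OCE reshaping and the convergence conditions are those stated in the paper.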
Business Value
Enables the development of safer and more reliable autonomous systems (e.g., self-driving cars, industrial robots) and financial trading algorithms by explicitly managing risks and uncertainties.