
Online POMDP Planning with Anytime Deterministic Optimality Guarantees

Abstract

Decision-making under uncertainty is a critical aspect of many practical autonomous systems due to incomplete information. Partially Observable Markov Decision Processes (POMDPs) offer a mathematically principled framework for formulating decision-making problems under such conditions. However, finding an optimal solution for a POMDP is generally intractable. In recent years, there has been significant progress in scaling approximate solvers from small to moderately sized problems, using online tree search solvers. Often, such approximate solvers are limited to probabilistic or asymptotic guarantees with respect to the optimal solution. In this paper, we derive a deterministic relationship for discrete POMDPs between an approximated solution and the optimal solution. We show that at any time, we can derive bounds that relate the existing solution to the optimal one. We show that our derivations provide an avenue for a new set of algorithms and can be attached to existing algorithms that have a certain structure, providing them with deterministic guarantees at marginal computational overhead. In return, not only do we certify the solution quality, but we also demonstrate that making a decision based on the deterministic guarantee may result in superior performance compared to the original algorithm without the deterministic certification.
Authors (2)
Moran Barenboim
Vadim Indelman
Submitted
October 3, 2023
arXiv Category
cs.AI

Key Contributions

Derives a deterministic relationship between an approximated solution and the optimal solution for discrete POMDPs, yielding bounds that can be computed at any time. The derivation opens the way to new algorithms and can be attached to existing online tree search solvers that have a certain structure, giving them anytime deterministic optimality guarantees at marginal computational overhead.
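
To make the idea of deciding from deterministic bounds concrete, here is a minimal Python sketch. It is not the authors' algorithm: it assumes that some solver has already produced, for each action at the root, a guaranteed lower and upper bound on that action's optimal value, and it only shows how such intervals could be turned into a decision and a certificate. The names `ActionBounds` and `select_action` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ActionBounds:
    """Hypothetical container: guaranteed bounds on one action's optimal value."""
    action: str
    lower: float  # assumed to satisfy lower <= Q*(action)
    upper: float  # assumed to satisfy Q*(action) <= upper


def select_action(bounds: List[ActionBounds]) -> Tuple[str, float, bool]:
    """Pick the action with the highest lower bound.

    Returns (action, regret_bound, certified), where regret_bound caps how
    much value could be lost relative to the optimal action, and certified
    is True when no other action can possibly be better.
    """
    if not bounds:
        raise ValueError("at least one action is required")

    best = max(bounds, key=lambda b: b.lower)
    rival_uppers = [b.upper for b in bounds if b is not best]

    if not rival_uppers:
        # A single action is trivially optimal.
        return best.action, 0.0, True

    rival_upper = max(rival_uppers)
    # Any rival is worth at most rival_upper; the chosen action is worth
    # at least best.lower, so the loss is at most their difference.
    regret_bound = max(0.0, rival_upper - best.lower)
    certified = best.lower >= rival_upper
    return best.action, regret_bound, certified


if __name__ == "__main__":
    intervals = [
        ActionBounds("left", lower=1.0, upper=3.0),
        ActionBounds("right", lower=2.0, upper=2.5),
    ]
    print(select_action(intervals))
```

Because the bounds are assumed to hold deterministically, the returned regret bound holds with certainty rather than with high probability. As planning continues and the intervals tighten, the same function can be called again at any time, which is the anytime flavor the summary describes.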

Business Value

Enhances the reliability and trustworthiness of autonomous systems operating under uncertainty, crucial for safety-critical applications like autonomous driving and robotics.