Active Measuring in Reinforcement Learning With Delayed Negative Effects

Abstract

Measuring states in reinforcement learning (RL) can be costly in real-world settings and may negatively influence future outcomes. We introduce the Actively Observable Markov Decision Process (AOMDP), where an agent not only selects control actions but also decides whether to measure the latent state. The measurement action reveals the true latent state but may have a negative delayed effect on the environment. We show that this reduced uncertainty may provably improve sample efficiency and increase the value of the optimal policy despite these costs. We formulate an AOMDP as a periodic partially observable MDP and propose an online RL algorithm based on belief states. To approximate the belief states, we further propose a sequential Monte Carlo method to jointly approximate the posterior of unknown static environment parameters and unobserved latent states. We evaluate the proposed algorithm in a digital health application, where the agent decides when to deliver digital interventions and when to assess users' health status through surveys.
Authors (5)
Daiqi Gao
Ziping Xu
Aseel Rawashdeh
Predrag Klasnja
Susan A. Murphy
Submitted
October 16, 2025
arXiv Category
cs.LG

Key Contributions

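Introduces the Actively Observable Markov Decision Process (AOMDP), in which the agent selects a control action and also decides whether to measure the latent state. Measuring reveals the true state but may have a delayed negative effect on the environment. Formulates the AOMDP as a periodic partially observable MDP and proposes an online RL algorithm based on belief states, approximated with a sequential Monte Carlo method that jointly targets the posterior of the unknown static environment parameters and the unobserved latent states. Shows that the reduced uncertainty can provably improve sample efficiency and increase the value of the optimal policy despite these costs.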
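
To make the belief-state idea concrete, here is a minimal sketch of a particle-based belief update in a toy setting. It is not the paper's algorithm. The scalar "health score" state, the dynamics in `transition`, the noisy passive observation, the fixed measurement schedule, and all numeric constants are assumptions made for illustration. Unlike the method described above, the sketch treats the environment parameters as known and tracks only the latent state, using a plain bootstrap particle filter.

```python
import math
import random


def transition(state, control, measured_prev, rng):
    """Hypothetical latent dynamics for a scalar state in [0, 1].

    Assumption: a control action nudges the state up, while a measurement
    taken at the previous step nudges it down (the delayed negative effect).
    """
    drift = 0.05 * control - 0.03 * measured_prev
    noise = rng.gauss(0.0, 0.05)
    return min(1.0, max(0.0, state + drift + noise))


def likelihood(obs, state, sigma=0.2):
    """Unnormalized Gaussian likelihood of a noisy passive observation."""
    return math.exp(-0.5 * ((obs - state) / sigma) ** 2)


def update_belief(particles, control, measured_prev, rng,
                  revealed_state=None, noisy_obs=None):
    """One bootstrap-particle-filter step for the belief over the latent state."""
    # 1. Propagate every particle through the assumed dynamics.
    proposed = [transition(s, control, measured_prev, rng) for s in particles]

    # 2a. A measurement reveals the true state, so the belief collapses to it.
    if revealed_state is not None:
        return [revealed_state] * len(proposed)

    # 2b. Otherwise reweight by the noisy observation (if any) and resample.
    if noisy_obs is None:
        return proposed
    weights = [likelihood(noisy_obs, s) for s in proposed]
    if sum(weights) == 0.0:
        return proposed  # degenerate weights: keep the propagated particles
    return rng.choices(proposed, weights=weights, k=len(proposed))


def simulate(steps=20, n_particles=500, measure_every=5, seed=0):
    """Run the toy model and record the belief spread (max - min) per step."""
    rng = random.Random(seed)
    state = 0.5
    particles = [rng.random() for _ in range(n_particles)]
    measured_prev = 0
    spreads = []
    for t in range(1, steps + 1):
        control = rng.randint(0, 1)  # placeholder policy: random control
        state = transition(state, control, measured_prev, rng)
        measure = 1 if t % measure_every == 0 else 0  # fixed schedule (assumed)
        if measure:
            particles = update_belief(particles, control, measured_prev, rng,
                                      revealed_state=state)
        else:
            obs = state + rng.gauss(0.0, 0.2)
            particles = update_belief(particles, control, measured_prev, rng,
                                      noisy_obs=obs)
        spreads.append(max(particles) - min(particles))
        measured_prev = measure
    return spreads


if __name__ == "__main__":
    for step, spread in enumerate(simulate(), start=1):
        print(f"t={step:2d}  belief spread={spread:.3f}")
```

In this sketch the belief spread drops to zero at each measured step and widens again afterwards, which illustrates the trade-off the agent faces: measuring removes uncertainty now but, under the assumed dynamics, pushes the state down at the next step. A learned policy would replace both the random control and the fixed measurement schedule.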

Business Value

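Supports decision-making in real-world settings where observing the state is costly or has side effects, such as healthcare or industrial control systems. The paper's own evaluation is a digital health application in which the agent decides when to deliver interventions and when to survey users about their health status.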