A continuity result for optimal memoryless planning in POMDPs
Johannes Rauh, Nihat Ay, and Guido Montúfar
Submission date: 09. Mar. 2021
Keywords and phrases: POMDPs, memoryless stochastic policy, optimal policy
Consider an infinite-horizon partially observable Markov decision process. We show that the optimal discounted reward under memoryless stochastic policies is continuous under perturbations of the observation channel. This implies that we can find approximately optimal memoryless policies by solving an approximate problem with a simpler observation channel.
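
The continuity statement can be illustrated numerically on a toy problem. The sketch below is not taken from the preprint. It assumes a hypothetical POMDP with two states, two observations and two actions, and every number in it (the transitions `P`, the rewards `R`, the initial distribution `MU`, the discount `GAMMA`, and the base channel in `channel`) is made up for illustration. It evaluates a memoryless stochastic policy by iterating the Bellman equation for the induced Markov chain, approximates the optimum by a coarse grid search over policies (a stand-in for a proper optimizer), and compares the optimal value before and after a small perturbation of the observation channel.

```python
# Toy POMDP: all numbers are illustrative assumptions, not from the preprint.
GAMMA = 0.9                       # discount factor
MU = [0.5, 0.5]                   # initial state distribution
P = [[[0.9, 0.1], [0.2, 0.8]],    # P[s][a][t]: probability of s -> t under action a
     [[0.5, 0.5], [0.1, 0.9]]]
R = [[1.0, 0.0], [0.0, 0.6]]      # R[s][a]: reward, chosen in [0, 1]


def channel(eps):
    """Observation channel beta[s][o], shifted by eps (rows still sum to 1)."""
    return [[0.8 - eps, 0.2 + eps], [0.3 + eps, 0.7 - eps]]


def discounted_reward(beta, pi, iters=300):
    """Expected discounted reward of the memoryless policy pi[o][a]."""
    n = range(2)
    # Effective state policy: q[s][a] = sum_o beta[s][o] * pi[o][a].
    q = [[sum(beta[s][o] * pi[o][a] for o in n) for a in n] for s in n]
    # Reward vector and transition matrix of the induced Markov chain.
    r = [sum(q[s][a] * R[s][a] for a in n) for s in n]
    T = [[sum(q[s][a] * P[s][a][t] for a in n) for t in n] for s in n]
    # Fixed-policy Bellman iteration: v <- r + GAMMA * T v.
    v = [0.0, 0.0]
    for _ in range(iters):
        v = [r[s] + GAMMA * sum(T[s][t] * v[t] for t in n) for s in n]
    return sum(MU[s] * v[s] for s in n)


def best_memoryless(beta, grid=11):
    """Grid search over memoryless stochastic policies; returns (value, policy)."""
    best_value, best_pi = float("-inf"), None
    for i in range(grid):
        for j in range(grid):
            x, y = i / (grid - 1), j / (grid - 1)
            pi = [[x, 1.0 - x], [y, 1.0 - y]]  # pi[o] = action distribution given o
            value = discounted_reward(beta, pi)
            if value > best_value:
                best_value, best_pi = value, pi
    return best_value, best_pi


def optimal_value_gap(eps):
    """Absolute change in the (grid-approximate) optimal value under perturbation eps."""
    return abs(best_memoryless(channel(0.0))[0] - best_memoryless(channel(eps))[0])


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(f"eps={eps:g}  gap={optimal_value_gap(eps):.6f}")
```

Running the script prints the gap in optimal value for a few perturbation sizes. In this toy setting the gap should shrink as `eps` shrinks, which is the qualitative behaviour the continuity result describes. The grid search only approximates the true optimum over stochastic policies, so the printed numbers are indicative rather than exact.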