Constrained Partially Observed Markov Decision Processes With Probabilistic Criteria for Adaptive Sequential Detection
Dynamic programming equations are derived that characterize the optimal value functions of a partially observed constrained Markov decision process with both total-cost and probabilistic criteria. Specifically, the goal is to minimize an expected total cost subject to a constraint on the probability that a second total cost exceeds a prescribed threshold. Although the Markov decision process is partially observed, the constraint costs are assumed to be available to the controller, i.e., they are fully observed. The problem is motivated by an adaptive sequential detection application. The dynamic programming results are applied to optimal adaptive truncated sequential detection in an example that optimizes a radar detection process.
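For concreteness, the constrained problem described above can be sketched in symbols. The notation below is an assumption introduced only for illustration and is not taken from the paper: $x_t$ is the hidden state, $u_t$ the control, $\pi$ a policy based on the available observations, $c$ the cost to be minimized, $d$ the (observed) constraint cost, $N$ the horizon or truncation time, $\tau$ the prescribed threshold, and $\epsilon$ the allowed exceedance probability.

```latex
% Illustrative sketch; all symbols are assumed notation.
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}^{\pi}\!\left[ \sum_{t=0}^{N} c(x_t, u_t) \right] \\
\text{subject to} \quad & \mathbb{P}^{\pi}\!\left( \sum_{t=0}^{N} d(x_t, u_t) > \tau \right) \le \epsilon .
\end{aligned}
```

Under this reading, the first line is the total-cost criterion and the second is the probabilistic criterion. Because the constraint costs are observed, the controller can track their running sum exactly, even though $x_t$ itself is only partially observed.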