Abstract
We prove that, given a Markov Decision Process (MDP) and a fixed subset F of its states, there is a Markov policy that maximizes, from every state, the probability of reaching F infinitely often. Moreover, such a maximal policy is computable in polynomial time in the size of the MDP. This result can be applied to control a system with randomized or uncertain behavior with respect to a given property to be optimized.
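The paper's polynomial-time algorithm is not reproduced here, but algorithms for such Büchi-style objectives ("reach F infinitely often") typically reduce to a maximal-reachability subproblem over the MDP. The sketch below illustrates only that standard subroutine via value iteration on a hypothetical toy MDP (all state names, the `transitions` encoding, and the target set are illustrative assumptions, not from the paper; value iteration only approximates the fixpoint, whereas a polynomial-time bound would rely on, e.g., linear programming):

```python
# Hypothetical toy MDP: transitions[state][action] = list of (successor, probability).
# This encoding is an illustrative assumption, not the paper's formalism.
transitions = {
    0: {"a": [(1, 0.5), (2, 0.5)], "b": [(3, 1.0)]},
    1: {"a": [(1, 1.0)]},  # absorbing target state
    2: {"a": [(0, 1.0)]},
    3: {"a": [(3, 1.0)]},  # absorbing sink state
}
target = {1}

def max_reach_prob(transitions, target, iters=1000):
    """Approximate, by value iteration, the maximal probability of
    eventually reaching `target` from each state of the MDP."""
    v = {s: (1.0 if s in target else 0.0) for s in transitions}
    for _ in range(iters):
        new = {}
        for s in transitions:
            if s in target:
                new[s] = 1.0
            else:
                # Best action: maximize the expected value over successors.
                new[s] = max(
                    sum(p * v[t] for t, p in succ)
                    for succ in transitions[s].values()
                )
        v = new
    return v

values = max_reach_prob(transitions, target)
```

Here state 0 can always retry via state 2, so its maximal reachability probability converges to 1 under action "a", while the sink state 3 keeps probability 0; an exact polynomial-time computation would solve the corresponding linear program instead of iterating.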