Lab Lunch: 17 December 2019 - Richard Mayr

Abstract:

Given a countable Markov decision process (MDP) and a set of goal states, how can one visit the goal states infinitely often, with high probability? Optimal strategies need not exist.

How much memory do epsilon-optimal strategies need? In 1979 T.P. Hill (a student of Lester Dubbins, of Dubbins & Savage "How to Gamble If You Must") proved that Markov strategies (i.e., just using a clock) suffice if the number of controlled states is finite. The general case remained open until 2019, and has a surprising solution. The strategies need exactly a clock and 1 extra bit of memory.

Reference:

Stefan Kiefer, Richard Mayr, Mahsa Shirmohammadi, Patrick Totzke. Büchi Objectives in Countable MDPs. ICALP 2019, LIPIcs, Vol. 132. Athens, Greece. 2019. http://arxiv.org/abs/1904.11573

Dec 17 2019 13.00 - 14.00

Lab Lunch: 17 December 2019 - Richard Mayr

Speaker: Richard Mayr

MF2 level 4

This article was published on 16 Dec, 2019