Skip to main content

A Logic for Reinforcement Learning Systems

Primary supervisor

Julian Gutierrez Santiago

Reinforcement Learning (RL) systems can be represented as Markov Decision Processes (MDPs), which are graph-based models of probabilistic behaviour. Typically, a logic over MDPs predicates only about temporal or epistemic properties of such systems, but fails to express properties about the learning behaviour that such systems may represent. In this project, the aim is to investigate extensions of temporal and epistemic logics to be able to express learning properties of RL systems consisting of multiple agents. In particular, the goal of this project is to define the syntax and semantics of such a logic and to develop optimal model checking decision procedures as well as an associated proof-of-concept tool implementation to evaluate the new logic-based reasoning framework.

Required knowledge

The student should have strong mathematical skills, and ideally some knowledge of computational complexity, automata theory, and basics of logic and proof.