Explainability of Reinforcement Learning Policies for Human-Robot Interaction

Primary supervisor

Mor Vered

Co-supervisors

  • Dana Kulic
  • Leimin Tian

This PhD project will investigate the explainability of reinforcement learning (RL) policies in the context of human-robot interaction (HRI), aiming to bridge the gap between advanced RL decision-making and human trust, understanding, and collaboration. The research will critically evaluate and extend state-of-the-art explainability methods for RL, such as policy summarization, counterfactual reasoning, and interpretable model approximations, to make robot decision processes more transparent and intuitive. Through a series of user studies, the project will measure the impact of these explanations on appropriate trust and task performance, exploring how different forms of explanation shape the robot-user relationship. By combining technical advances in RL explainability with empirical insights from human factors research, this work will contribute both novel algorithms and evidence-based design guidelines for developing socially aware and trustworthy autonomous robots.
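As a purely illustrative sketch of one of the methods named above, interpretable model approximation, the snippet below distills a policy into a shallow decision tree by imitating its action choices on sampled states. The trained_policy function, state features, and all parameter values are hypothetical placeholders, not part of the project specification; the example assumes Python with NumPy and scikit-learn.

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    # Hypothetical stand-in for a trained RL policy: maps a state vector to a
    # discrete action. In practice this would be a learned (e.g. deep RL) policy.
    def trained_policy(state):
        distance_to_person, person_facing_robot = state
        # Approach only when the person is close and facing the robot.
        return 1 if (distance_to_person < 1.5 and person_facing_robot > 0.5) else 0

    # Sample states from the (assumed) state space and record the policy's actions.
    rng = np.random.default_rng(0)
    states = rng.uniform(low=[0.0, 0.0], high=[5.0, 1.0], size=(2000, 2))
    actions = np.array([trained_policy(s) for s in states])

    # Fit a shallow decision tree as an interpretable surrogate of the policy.
    surrogate = DecisionTreeClassifier(max_depth=3).fit(states, actions)

    # The printed rules can serve as a human-readable explanation of the policy.
    print(export_text(surrogate,
                      feature_names=["distance_to_person", "person_facing_robot"]))

In a real study, one would also measure how faithfully the surrogate reproduces the original policy before presenting its rules to users as explanations.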

Required knowledge

Candidates should hold a prior degree (Bachelor’s and/or Master’s) in Computer Science, Data Science, Robotics, Mechatronics, or Software Engineering, with demonstrated knowledge of machine learning, algorithms, and programming.

Prior exposure to reinforcement learning or human-robot interaction is highly desirable, though motivated candidates with a strong grounding in AI/ML and a willingness to learn robotics or human-centered research methods will also be considered.

Experience with programming languages (particularly Python), deep learning frameworks, and robotics middleware or simulation platforms (ROS, Gazebo, Webots, or similar) will provide a significant advantage.


Learn more about minimum entry requirements.