Primary supervisor
Leimin Tian
Co-supervisors
- Frits de Nijs
- Dr Pamela Carreno (Engineering)
Deep reinforcement learning (RL) has shown promising performance in human-robot interaction. Previous studies have shown that a robot using a deep RL model can learn social skills over time, such as shaking hands [1] or approaching a group of people while adhering to social norms [2]. However, deep RL is prone to erroneous modelling of the task [3]. In particular, it is difficult to design a reward function that encourages only the desired behaviors [4], and a poorly chosen reward can reinforce unintended ones, much as animals and humans may develop “superstitious” behaviors in operant conditioning [5]. The goal of this project is to design a robot that learns a social norm using deep RL, and to identify suitable reward functions that avoid creating a “superstitious” robot.
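To make the reward-design problem concrete, the sketch below compares two reward functions on a toy one-dimensional approach task. Everything in it is an illustrative assumption and is not taken from the cited work: the discretised distance, the comfortable distance `COMFORT`, and both reward functions are hypothetical, and tabular value iteration stands in for deep RL so that the result is deterministic.

```python
# Toy illustration: how the reward function shapes the learned approach behavior.
# All constants and reward functions here are hypothetical.

N_STATES = 10          # distance to the person in discrete steps (0 = contact)
ACTIONS = (-1, 0, 1)   # approach, stay, retreat
COMFORT = 3            # assumed socially comfortable distance
GAMMA = 0.9            # discount factor


def step(state, action):
    """Move along the line, clamped to the valid range of distances."""
    return min(max(state + action, 0), N_STATES - 1)


def naive_reward(state):
    """Closer is always better: ignores personal space."""
    return -float(state)


def norm_aware_reward(state):
    """Best reward at the comfortable distance, worse on either side."""
    return -float(abs(state - COMFORT))


def greedy_policy(reward_fn, sweeps=200):
    """Run value iteration, then return the greedy action for each state."""
    values = [0.0] * N_STATES

    def q(s, a):
        nxt = step(s, a)
        return reward_fn(nxt) + GAMMA * values[nxt]

    for _ in range(sweeps):
        values = [max(q(s, a) for a in ACTIONS) for s in range(N_STATES)]
    return [max(ACTIONS, key=lambda a: q(s, a)) for s in range(N_STATES)]


def final_distance(policy, start=N_STATES - 1, horizon=30):
    """Follow the policy from the far end and report where the robot ends up."""
    state = start
    for _ in range(horizon):
        state = step(state, policy[state])
    return state


if __name__ == "__main__":
    print("naive reward:     ", final_distance(greedy_policy(naive_reward)))
    print("norm-aware reward:", final_distance(greedy_policy(norm_aware_reward)))
```

Under these assumptions, the naive reward drives the robot all the way into contact with the person, while the norm-aware reward makes it stop at the comfortable distance. Both policies are optimal for the reward they were given; only one matches the intended social norm.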
Student cohort
Aim/outline
Basic goals:
- Developing a reinforcement learning model that incorporates emotions to enable adaptive agent behaviors.
- Extending the emotional RL model to include observations of the environment, such as task outcomes or user engagement.
Possible extensions:
- Applying the emotional RL model to a social robot.
- Conducting human-robot interaction experiments to evaluate the emotional RL model.
URLs/references
- [1] Qureshi, A.H., et al. 2016. Robot gains social intelligence through multimodal deep reinforcement learning. Humanoids 2016. (https://github.com/ahq1993/Multimodal-Deep-Q-Network-for-Social-Human-Robot-Interaction)
- [2] Gao, Y., et al. 2019. Learning socially appropriate robot approaching behavior toward groups using deep reinforcement learning. RO-MAN 2019. (https://github.com/usr-lab/pepper-social/tree/master)
- [3] Mongillo, G., et al. 2014. The misbehavior of reinforcement learning. Proceedings of the IEEE, 102(4), pp.528-541.
- [4] Irpan, A., 2018. Deep reinforcement learning doesn’t work yet. (https://www.alexirpan.com/2018/02/14/rl-hard.html)
- [5] Skinner, B.F., 1948. “Superstition” in the pigeon. Journal of Experimental Psychology, 38(2), p.168.
Required knowledge
Python programming, reinforcement learning