
Improving audiovisual sensing functions of a social robot

Primary supervisor

Leimin Tian


Co-supervisors
  • Dr Pamela Carreno (Engineering)

For a robot to communicate and interact with people, it requires the capability to understand audiovisual inputs, such as speech, gestures, and facial expressions, and to generate natural, timely, and expressive responses through these modalities. Current audiovisual sensing and generation functions of robots can benefit from leveraging advances in deep learning approaches reported in multimodal behavioural analysis research. In this project, your objective is to apply state-of-the-art audiovisual sensing and generation models to improve the social communicative functions of a Pepper robot.

Student cohort

Double Semester


Basic goals:

  1. Developing multimodal recognition models for human emotions and behaviours
  2. Evaluating the performance of the multimodal recognition models on existing human-robot interaction datasets
  3. Applying the multimodal recognition models to the Pepper robot
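As a starting point for the first two goals, a common baseline for combining speech and facial-expression cues is decision-level (late) fusion, where each modality's classifier produces scores that are averaged into a joint prediction. The sketch below illustrates the idea in plain Python; the emotion label set, logit values, and fusion weight are all illustrative assumptions, not part of the project brief:

```python
import math

EMOTIONS = ["happy", "sad", "angry", "neutral"]

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def late_fusion(audio_logits, visual_logits, audio_weight=0.5):
    """Decision-level fusion: weighted average of the per-modality
    probability distributions over the emotion classes."""
    p_audio = softmax(audio_logits)
    p_visual = softmax(visual_logits)
    return [audio_weight * a + (1 - audio_weight) * v
            for a, v in zip(p_audio, p_visual)]

# Toy example: hypothetical scores where the speech model leans
# towards "sad" and the vision model towards "neutral".
audio_logits = [0.1, 2.0, 0.3, 1.2]
visual_logits = [0.2, 0.8, 0.1, 2.5]

fused = late_fusion(audio_logits, visual_logits)
prediction = EMOTIONS[fused.index(max(fused))]
print(prediction)  # the modality with the more confident distribution dominates
```

In practice the per-modality scores would come from trained deep models (e.g. a speech emotion network and a facial expression network), and the fusion weight can be tuned on a validation set; feature-level (early) fusion is the usual alternative to compare against.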

Possible extensions:

  1. Conducting human-robot interaction experiments to evaluate the performance of the multimodal recognition models
  2. Developing behaviour generation models based on features identified by the recognition models

Required knowledge

Python programming, deep learning