Human Spatio-temporal Action, Social Group and Activity Detection from Video

Primary supervisor

Hamid Rezatofighi

Understanding human behaviour in videos is a crucial task for autonomous vehicles, robot navigation and surveillance systems. In a real scene comprising several actors, each person performs one or more individual actions. Moreover, people generally form several social groups with potentially different social connections, e.g. contributing toward a common activity or goal. In this project, we tackle the problem of simultaneously grouping people by their social interactions, predicting their individual actions, and predicting the social activity of each social group; we call this the social task. Our goal is to propose a holistic approach that accounts for the multi-task nature of the problem: these tasks are not independent and can benefit each other.
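To make the three outputs of the social task concrete, here is a toy sketch in Python. It is not the project's method: the pairwise affinity scores and action labels are made-up inputs that a learned model would normally predict, the threshold is arbitrary, and both the connected-components grouping and the majority vote for the group activity are simple stand-ins chosen for illustration. All function names are hypothetical.

```python
from collections import Counter


def group_people(affinity, threshold=0.5):
    """Group people via connected components of the thresholded affinity graph.

    `affinity[i][j]` is an assumed score in [0, 1] for how strongly
    persons i and j interact; it is illustrative input, not model output.
    """
    n = len(affinity)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if affinity[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def social_task_output(affinity, actions, threshold=0.5):
    """Return, per social group: its members, their actions, and one activity.

    The group activity is a majority vote over member actions, which is
    only a placeholder for a jointly learned prediction.
    """
    results = []
    for members in group_people(affinity, threshold):
        member_actions = [actions[i] for i in members]
        activity = Counter(member_actions).most_common(1)[0][0]
        results.append(
            {"members": members, "actions": member_actions, "activity": activity}
        )
    return results


# Four people: 0 and 1 interact strongly, as do 2 and 3 (made-up scores).
affinity = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
actions = ["walking", "walking", "standing", "standing"]
result = social_task_output(affinity, actions)
```

With these inputs, `result` holds two groups, one per pair of interacting people. Note that this sketch runs the steps as a fixed pipeline, whereas the project aims to learn the three tasks jointly so that each can inform the others.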

Student cohort

Double Semester

URLs/references

https://vl4ai.erc.monash.edu/research.html

https://arxiv.org/pdf/2007.02632.pdf

 

Required knowledge

  1. Good coding skills in a variety of programming languages
  2. Previous experience working with deep learning models for different tasks
  3. Proficient programming skills in Python and one of the main deep learning libraries (e.g., TensorFlow, PyTorch, Keras)