
Human-in-the-loop machine learning

Primary supervisor

Lan Du


Project description: 

Data-driven machine learning algorithms are well suited to real-world problems that require high prediction accuracy. However, it seems that nothing beats the simple equation: more training data = better performance. Learning methods, in particular advanced deep learning methods such as BERT for NLP and ResNet for image processing, often require thousands or millions of data samples to converge to satisfactory performance, a problem known as data hunger. A fundamental difficulty is that acquiring a large amount of high-quality data is laborious and time-consuming, and requires substantial human expertise. Conversely, in real-world scenarios, humans can often uncover the underlying pattern in apparently patternless data after seeing just a few samples, and can make accurate predictions or decisions with rationales, i.e., the reasons leading to those predictions or decisions. Human-in-the-Loop Machine Learning [1] bridges this gap by incorporating human interaction into the learning process, either to improve algorithm performance or to complement the information provided by the data. It is a practical guide to optimizing the machine learning process, covering techniques for annotation, active learning (based on either deep learning or Bayesian learning), semi-supervised learning, transfer learning, imitation learning, etc., with the aim of ensuring that the data and models are correct, relevant, and cost-effective.
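As one illustration of the active learning technique mentioned above, the following is a minimal sketch of pool-based uncertainty sampling; it is not the project's actual method. It assumes a binary classification task, uses a small NumPy logistic regression as a stand-in model, and relies on a hypothetical `oracle` callable that plays the role of the human annotator. All function names are illustrative.

```python
import numpy as np


def fit_logistic(X, y, steps=500, lr=0.5):
    """Fit logistic regression by plain gradient descent.

    X is expected to already contain a constant column for the bias.
    """
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * (X.T @ (p - y)) / len(y)
    return w


def most_uncertain(X_pool, w, unlabeled):
    """Return the pool index whose predicted probability is closest to 0.5."""
    p = 1.0 / (1.0 + np.exp(-(X_pool[unlabeled] @ w)))
    return unlabeled[int(np.argmin(np.abs(p - 0.5)))]


def active_learning_loop(X_pool, oracle, seed_idx, budget):
    """Query the (hypothetical) human oracle for `budget` extra labels."""
    labeled = list(seed_idx)
    labels = {i: oracle(i) for i in labeled}
    for _ in range(budget):
        unlabeled = [i for i in range(len(X_pool)) if i not in labels]
        if not unlabeled:
            break
        y = np.array([labels[i] for i in labeled])
        w = fit_logistic(X_pool[labeled], y)
        query = most_uncertain(X_pool, w, unlabeled)
        labels[query] = oracle(query)  # the human-in-the-loop step
        labeled.append(query)
    # Refit once more so the returned weights use every collected label.
    y = np.array([labels[i] for i in labeled])
    return fit_logistic(X_pool[labeled], y), labeled


if __name__ == "__main__":
    # Synthetic pool; the "human" is simulated by a hidden linear rule.
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(200, 2))
    X_pool = np.column_stack([feats, np.ones(len(feats))])
    y_true = (feats[:, 0] + feats[:, 1] > 0).astype(float)
    seed = [int(np.flatnonzero(y_true == 1)[0]),
            int(np.flatnonzero(y_true == 0)[0])]
    w, labeled = active_learning_loop(
        X_pool, lambda i: float(y_true[i]), seed, budget=10)
    preds = (X_pool @ w > 0).astype(float)
    print("labels used:", len(labeled), "accuracy:", (preds == y_true).mean())
```

In a real setting the `oracle` call would be replaced by an annotation interface, and the stand-in model by a deep or Bayesian learner; the point of the sketch is only that the model chooses which samples a human is asked to label.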


Project aim:

This research investigates how to integrate human interaction modalities (e.g., human evaluation and demonstrations) into machine learning in order to achieve what neither a human nor a machine can achieve alone. The aim is to develop cutting-edge Human-in-the-Loop Machine Learning algorithms that avoid bias, augment rare data, maintain human-level precision, and incorporate subject matter expertise, while ensuring consistency and accuracy and providing transparency and accountability. Such algorithms have significant cross-disciplinary implications, for example in public health, an area we are particularly interested in.


Required knowledge

  • Proficiency in Python programming, with experience in PyTorch/TensorFlow
  • Basic knowledge of machine learning

Project funding

