Primary supervisor

Zachari Swiecki

Note that this project is available as an undergraduate winter scholarship project

To understand human behaviour, researchers often analyse discourse: records of what people say and do. One commonly studied type of discourse is language in the form of text. While there have been many advances in natural language processing (NLP), discourse analysis often still requires laborious human effort. For example, large amounts of text data still need to be annotated by humans in order to train NLP models. The advent of generative AI tools like ChatGPT offers a potential way to address this issue by offloading some of this effort. However, while the capabilities of generative AI for creating text are well documented, less is known about its ability to annotate text.

Student cohort

Single Semester

Aim/outline

Aim: Develop and test methods that use generative AI to annotate textual discourse. 

Steps: (1) Develop an analysis pipeline that reads in text, passes it to a generative AI, and outputs labels for the text; (2) compare generative AI labels to human labels to test accuracy; (3) repeat the approach in several domains to test generalisability. 
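
As a rough illustration of steps (1) and (2), the sketch below reads in a few utterances, passes each one to a model, collects labels, and compares them with human labels using simple accuracy and Cohen's kappa. Everything in it is an assumption made for illustration rather than part of the project: the two-label codebook, the prompt wording, the example utterances, and the function names are all hypothetical, and `stub_model` is a rule-based stand-in that would be replaced by a call to whichever generative AI service is chosen.

```python
from collections import Counter

# Hypothetical two-code codebook, used only for this illustration.
LABELS = ("question", "statement")


def build_prompt(utterance):
    """Format one utterance as an annotation request (hypothetical wording)."""
    return (
        f"Label the following utterance as one of {', '.join(LABELS)}.\n"
        f"Utterance: {utterance}\nLabel:"
    )


def stub_model(prompt):
    """Stand-in for a generative AI call; swap in a real client here.

    It labels an utterance "question" if it ends with "?", else "statement".
    """
    utterance = prompt.split("Utterance: ", 1)[1].rsplit("\nLabel:", 1)[0]
    return "question" if utterance.strip().endswith("?") else "statement"


def annotate(texts, model=stub_model):
    """Step 1: pass each text to the model and collect one label per text."""
    labels = []
    for text in texts:
        raw = model(build_prompt(text)).strip().lower()
        # Anything outside the codebook is recorded as "unknown".
        labels.append(raw if raw in LABELS else "unknown")
    return labels


def accuracy(pred, gold):
    """Step 2a: proportion of items where the two label lists agree."""
    return sum(p == g for p, g in zip(pred, gold)) / len(gold)


def cohen_kappa(a, b):
    """Step 2b: agreement between two label lists, corrected for chance."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:  # both raters used a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up example data; real data would come from a coded discourse corpus.
texts = [
    "What should we do next?",
    "I think the valve is stuck.",
    "Can you check it?",
    "Really",
]
human = ["question", "statement", "question", "question"]

machine = annotate(texts)
print(machine)
print("accuracy:", accuracy(machine, human))
print("kappa:", cohen_kappa(machine, human))
```

Keeping the model behind a plain function argument means the same pipeline could be rerun on data from another domain, or with a different model, by changing only the inputs, which is one possible way to approach step (3).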

Required knowledge

Working knowledge or desire to learn:

Data science (e.g., data wrangling, NLP)

Mathematics (e.g., linear algebra)

Qualitative methods (e.g., discourse coding)

Software engineering (e.g., package and GUI development)