Fair Knowledge Tracing in Second Language Acquisition

Primary supervisor

Guanliang Chen

Knowledge tracing plays a crucial role in adaptive learning: by estimating a student’s current knowledge state and predicting her performance in future interactions, a system can provide personalized learning materials (e.g., on the topics the student is estimated to know the least about). Over the years, various knowledge tracing techniques have been proposed and studied, including Bayesian Knowledge Tracing, Performance Factors Analysis, Learning Factors Analysis, and Deep Knowledge Tracing. Notably, most existing work focuses on learning performance in elementary and high school mathematics, owing to the availability of sufficiently large datasets in this domain. Generalization to other learning scenarios and domains remains under-explored. In particular, few studies have explored knowledge tracing in the setting of Second Language Acquisition (SLA). Recent studies have shown that SLA is becoming increasingly important in people’s daily lives and deserves more research attention to support learners.
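To make the idea of "estimating a knowledge state and predicting future performance" concrete, the following is a minimal sketch of the standard Bayesian Knowledge Tracing update for a single skill. The function names and the slip, guess, and learn parameter values are illustrative assumptions, not values taken from any dataset or from this project.

```python
def bkt_update(p_know, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """One BKT step: Bayesian posterior given the response, then a learning transition.

    p_know is the prior probability that the student has mastered the skill.
    The slip/guess/learn defaults are illustrative assumptions only.
    """
    if correct:
        num = p_know * (1 - p_slip)
        den = num + (1 - p_know) * p_guess
    else:
        num = p_know * p_slip
        den = num + (1 - p_know) * (1 - p_guess)
    posterior = num / den
    # The student may also acquire the skill after this practice opportunity.
    return posterior + (1 - posterior) * p_learn


def predict_correct(p_know, p_slip=0.1, p_guess=0.2):
    """Probability that the next response on this skill is correct."""
    return p_know * (1 - p_slip) + (1 - p_know) * p_guess


# Trace a hypothetical student over a short sequence of responses.
p = 0.3
for response in [True, False, True, True]:
    p = bkt_update(p, response)
print(round(p, 3), round(predict_correct(p), 3))
```

An adaptive system could use the running estimate to pick the skill with the lowest mastery probability as the next topic to practise.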

Student cohort

Double Semester


This project aims to use the dataset released in the Duolingo Shared Task on Second Language Acquisition Modeling to investigate how state-of-the-art pre-trained language models such as BERT can be used to model students' knowledge states. Specifically, we will use the model developed by Srivastava & Goodman (2021) as our testbed to investigate: (1) to what extent a student's historical performance records should be used for the current prediction; and (2) to what extent such a BERT-based knowledge tracing model displays algorithmic bias towards students with different demographic attributes (e.g., students from developed countries vs. those from developing countries).


  • Settles, B., Brust, C., Gustafson, E., Hagiwara, M., & Madnani, N. (2018, June). Second language acquisition modeling. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 56-65).
  • Srivastava, M., & Goodman, N. (2021). Question Generation for Adaptive Education. arXiv preprint arXiv:2106.04262.

Required knowledge

  • Strong programming skills (e.g., Python)
  • Basic knowledge of Data Science, Natural Language Processing, and Machine Learning
  • The following would be a plus: (i) prior experience in applying Deep Learning models; (ii) good academic writing skills; and (iii) strong motivation to pursue a quality academic publication.