Enhancing Large Language Models in Automated Essay Scoring via Explainable AI

Primary supervisor

Guanliang Chen

In education, writing is a common pedagogical practice that teachers and instructors use to enhance student learning. Yet evaluating students' essays or responses in a timely manner is a formidable challenge that consumes considerable time and cognitive effort. Automated Essay Scoring (AES), the use of machine learning techniques to evaluate and assign scores to student-authored essays or responses, has emerged to alleviate this burden. Automating this assessment lets educators focus on refining their teaching strategies, ultimately enabling a more efficient and effective learning experience for students.

Student cohort

Double Semester


This project aims to apply explainable AI techniques to enhance Large Language Models (e.g., BERT, LLaMA, and GPT-4) so that they score student essays accurately and reliably. Two Kaggle datasets can be used for this project: one for short answer scoring and one for essay scoring. Specific tasks include: (1) fine-tuning Large Language Models to perform automated essay scoring; (2) evaluating existing prompting strategies that enable Large Language Models to perform automated essay scoring; (3) evaluating existing prompting strategies that enable Large Language Models to provide reliable explanations of their assessments of student essays.
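As a rough illustration of tasks (2) and (3), a prompting strategy could ask the model for a score together with a short explanation and then validate the reply. The sketch below is only one possible setup, not the project's prescribed method: the function names, the prompt wording, the JSON reply format, and the example score range are all assumptions made for illustration, and no particular model or API is called.

```python
import json
import re


def build_scoring_prompt(essay, rubric, min_score, max_score):
    """Build a zero-shot prompt asking for a score plus a short explanation.

    The wording and the JSON reply format are illustrative assumptions.
    """
    return (
        "You are grading a student essay.\n"
        f"Rubric: {rubric}\n"
        f"Assign an integer score from {min_score} to {max_score}.\n"
        'Reply with JSON only, e.g. {"score": 3, "explanation": "..."}.\n\n'
        f"Essay:\n{essay}\n"
    )


def parse_scoring_reply(reply, min_score, max_score):
    """Return (score, explanation) from a model reply, or None if unusable."""
    # Models often wrap JSON in extra text, so take the outermost {...} span.
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        return None
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    score = data.get("score")
    explanation = data.get("explanation")
    # Reject non-integer scores (bool is a subclass of int in Python).
    if not isinstance(score, int) or isinstance(score, bool):
        return None
    if not min_score <= score <= max_score:
        return None
    if not isinstance(explanation, str) or not explanation.strip():
        return None
    return score, explanation.strip()
```

Rejecting malformed or out-of-range replies, rather than silently coercing them, makes it possible to report how often a given prompting strategy yields an unusable answer, which is one simple aspect of reliability.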

Required knowledge

  • Strong programming skills (e.g., Python)
  • Basic knowledge in Data Science, Natural Language Processing, Machine Learning, and Large Language Models
  • The following are a plus: (i) good academic writing skills; and (ii) strong motivation to pursue a quality academic publication.