Primary supervisor

Russell Tsuchida

Co-supervisors


Pretrained models

The hidden layers of pretrained foundation models, such as ChatGPT, contain useful and abstract summaries of data. From an information-theoretic perspective, they might compress the data. From a machine learning perspective, they compute useful features of the data. From a statistics perspective, they might be sufficient statistics for a parameter of interest.
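As a rough illustration of reading out these summaries, the sketch below extracts hidden-layer features from an open pretrained model. It assumes the Hugging Face transformers library and the public gpt2 checkpoint, neither of which is specified by the project; any open model with accessible hidden states would serve.

    # Minimal sketch: extract hidden-layer features from a pretrained
    # model. Assumes the Hugging Face transformers library and the public
    # gpt2 checkpoint (illustrative choices, not project requirements).
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
    model.eval()

    inputs = tokenizer("An example sentence.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # outputs.hidden_states holds one tensor per layer, each of shape
    # (batch, sequence_length, hidden_dim). Mean-pooling one layer gives
    # a fixed-length feature vector for the input.
    features = outputs.hidden_states[-1].mean(dim=1)  # shape (1, 768)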

Probabilistic models

Among other things, probabilistic models can quantitatively describe the long-term frequencies of events (such as words or sentences) or beliefs about events. A classical approach for building such models, the exponential family, is to apply the exponential function to a linear transformation of a statistic of the data and normalise the result.
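In symbols (with notation introduced here for illustration), writing T(x) for the statistic and \theta for the parameter, this construction reads

    p_\theta(x) = \frac{\exp(\theta^\top T(x))}{Z(\theta)},
    \qquad
    Z(\theta) = \sum_x \exp(\theta^\top T(x)),

with the sum replaced by an integral for continuous data. Computing or approximating the normalising constant Z(\theta) is the computational question raised in the project outline below.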

Student cohort

Single Semester
Double Semester

Aim/outline

This project will use the statistics computed by pretrained foundation models to define classes of probability distributions. A key computational consideration is how the normalising constant will be computed. The exact outline of the project may differ depending on the student, but might look something like the following:

  • Familiarisation with probabilistic modelling and exponential families
  • Mathematical formulation of probabilistic modelling using pretrained models
  • Download and installation of software packages for pretrained models
  • Writing software which fits new data to probabilistic models (see the sketch after this list)
  • Performing numerical experiments using the new software
  • Writing the methods and results in a report
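
To make the fitting step concrete, here is a hedged sketch of what such software could look like: a linear map theta applied to frozen pretrained features, exponentiated and normalised over a finite candidate set so the normalising constant can be summed exactly. The tensors features and observed are illustrative stand-ins, not part of the project specification.

    import torch

    # Hypothetical setup: `features` holds frozen pretrained statistics
    # T(x) for every item x in a finite candidate set, so the normalising
    # constant Z(theta) is an exact sum. Random data stands in for real
    # pretrained features here.
    num_candidates, feature_dim = 1000, 768
    features = torch.randn(num_candidates, feature_dim)
    observed = torch.randint(0, num_candidates, (64,))  # observed data indices

    theta = torch.zeros(feature_dim, requires_grad=True)
    optimiser = torch.optim.Adam([theta], lr=1e-2)

    for step in range(200):
        logits = features @ theta               # theta^T T(x) for every x
        log_Z = torch.logsumexp(logits, dim=0)  # exact log normalising constant
        log_prob = logits[observed] - log_Z     # log p_theta(x) at the data
        loss = -log_prob.mean()                 # negative log-likelihood
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()

When the space of events is too large to enumerate, Z(\theta) must instead be approximated, for example by sampling; that choice is the computational consideration flagged above.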

Required knowledge

The student should have strong Python skills and, ideally, some experience using PyTorch.

This project will involve downloading and using pretrained models, for example via https://ollama.com/ or https://github.com/evo-design/evo.

The student will be required to follow some derivations concerning probabilistic machine learning methods, to the standard of an HD undergraduate student.