Primary supervisor
Russell Tsuchida
Co-supervisors
Pretrained models
The hidden layers of pretrained foundation models, such as the large language models behind ChatGPT, contain useful and abstract summaries of data. From an information-theoretic perspective, they might compress the data. From a machine learning perspective, they compute useful features of the data. From a statistics perspective, they might be sufficient statistics for a parameter of interest.
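For intuition, a hidden layer can be read off in PyTorch with a forward hook. The sketch below is a minimal illustration using a toy stand-in network; the architecture and layer choice are illustrative assumptions, not any particular foundation model.

```python
# Minimal sketch, assuming a toy stand-in network rather than a real
# foundation model: a forward hook captures a hidden layer's activations,
# which play the role of the abstract "summary" or statistic described above.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 4))

captured = {}
def save_hidden(module, inputs, output):
    captured["hidden"] = output.detach()

model[1].register_forward_hook(save_hidden)  # tap the post-ReLU hidden layer

x = torch.randn(5, 8)            # a batch of 5 data points
_ = model(x)
print(captured["hidden"].shape)  # torch.Size([5, 32]) -- the extracted statistic
```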
Probabilistic models
Among other things, probabilistic models can quantitatively describe the long-term frequencies of events (such as words or sentences) or beliefs about events. A classical approach for building such models is to apply the exponential function to a linear transformation of the statistic, and normalise the result.
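In generic exponential-family notation (a standard textbook form; the symbols are not fixed by this project), that classical construction reads:

```latex
p(x \mid \theta) = \frac{\exp\big(\theta^\top T(x)\big)}{Z(\theta)},
\qquad
Z(\theta) = \sum_{x'} \exp\big(\theta^\top T(x')\big),
```

where T(x) is the statistic, theta is a natural parameter, and Z(theta) is the normalising constant (a sum over discrete events such as words or sentences, or an integral for continuous data).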
Student cohort
Aim/outline
This project will use the statistics computed by pretrained foundation models to define classes of probability distributions. A key computational consideration is how the normalising constant will be computed. The exact outline of the project may differ from student to student, but might look something like the following:
- Familiarisation with probabilistic modelling and exponential families
- Mathematical formulation of probabilistic modelling using pretrained models
- Download and installation of software packages for pretrained models
- Writing software which fits new data to probabilistic models (a toy sketch follows this list)
- Performing numerical experiments using the new software
- Writing the methods and results in a report
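The sketch below, referenced in the software-writing step above, is one possible shape for that software under illustrative assumptions: a frozen random projection stands in for a pretrained model's statistic T(x), and the normalising constant is computed exactly by summing over a small finite candidate set, one simple answer to the normalisation question raised in the outline.

```python
# Toy sketch under stated assumptions; not the project's prescribed method.
# Fits theta in p(x) proportional to exp(theta^T T(x)) by maximum likelihood,
# computing the normalising constant exactly over a finite support.
import torch

torch.manual_seed(0)
feature_dim, data_dim = 16, 8

W = torch.randn(feature_dim, data_dim)  # frozen stand-in for a pretrained model

def T(x):
    # The "statistic": a fixed nonlinear feature map, playing the role of
    # a pretrained model's hidden layer.
    return torch.tanh(x @ W.T)

support = torch.randn(500, data_dim)  # finite candidate set for exact normalisation
data = support[:50]                   # toy observed sample, drawn from the support

theta = torch.zeros(feature_dim, requires_grad=True)
opt = torch.optim.Adam([theta], lr=0.05)

for step in range(200):
    opt.zero_grad()
    log_z = torch.logsumexp(T(support) @ theta, dim=0)  # log normalising constant
    nll = log_z - (T(data) @ theta).mean()              # mean negative log-likelihood
    nll.backward()
    opt.step()

print(float(nll))  # should decrease over training
```

Swapping the random projection for a real pretrained encoder, and the finite sum for an estimator of the normalising constant, would be the substantive part of the project.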
Required knowledge
The student should have strong Python skills and, ideally, some experience using PyTorch.
This project will involve downloading and using pretrained models, for example those available via https://ollama.com/ or https://github.com/evo-design/evo.
The student will be required to follow derivations concerning probabilistic machine learning methods, at the standard expected of a high-distinction (HD) undergraduate student.
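As an example of the kind of derivation meant here (a standard exponential-family identity, not specific to this project), differentiating the log-likelihood of the model written out above gives

```latex
\nabla_\theta \log p(x \mid \theta)
= T(x) - \nabla_\theta \log Z(\theta)
= T(x) - \mathbb{E}_{p(x' \mid \theta)}\big[T(x')\big],
```

so the maximum-likelihood gradient compares the statistic of observed data with its expectation under the model.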