Primary supervisor: Enes Makalic
Research area: Machine Learning

Minimum Message Length (MML) is an elegant information-theoretic framework for statistical inference and model selection developed by Chris Wallace and colleagues. The fundamental insight of MML is that both parameter estimation and model selection can be interpreted as problems of data compression. The principle is simple: if we can compress data, we have learned something about its underlying structure.
MML constructs a two-part message consisting of an assertion (describing the model and its parameters) and a detail (encoding the data given the model). The total message length, measured in bits or nats, balances model complexity against goodness-of-fit; it is essentially a formal implementation of Occam's razor. A key advantage of MML is that message length provides a universal gauge for comparing models with entirely different structures and parameter counts, for example a linear regression against a mixture model or a decision tree.
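To make the complexity-versus-fit trade-off concrete, here is a minimal Python sketch (not part of the project description) that codes Bernoulli data with a two-part message: the assertion names a quantised estimate of the coin's bias, and the detail codes the flips under that estimate. The uniform grid is a toy stand-in for MML's optimal quantisation of the parameter space.

```python
import numpy as np

def two_part_length(x, grid_size):
    """Total two-part message length (bits) for Bernoulli data,
    with the bias quantised to a uniform grid of grid_size cells."""
    # Assertion: index of the chosen grid cell, coded uniformly.
    assertion = np.log2(grid_size)
    # Candidate estimates are the grid cell centres.
    thetas = (np.arange(grid_size) + 0.5) / grid_size
    k, n = x.sum(), len(x)
    # Detail: code length of the data under each candidate estimate.
    details = -(k * np.log2(thetas) + (n - k) * np.log2(1 - thetas))
    # Transmit the estimate that minimises the total length.
    return assertion + details.min()

rng = np.random.default_rng(0)
x = rng.random(1000) < 0.3          # 1000 coin flips with bias 0.3
for m in (2, 8, 32, 128, 512):
    print(f"grid size {m:4d}: {two_part_length(x, m):.1f} bits")
# A moderate grid wins: a coarse grid fits poorly (long detail), while
# a very fine grid wastes bits naming the estimate (long assertion).
```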
MML is strictly Bayesian, requiring prior distributions for inference, yet it differs from standard Bayesian approaches through its information-theoretic foundation. The MML87 approximation is computationally tractable while remaining virtually identical to Strict MML for well-behaved models, and it has been successfully applied to diverse problems including hypothesis testing, clustering, and machine learning.
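For reference, the MML87 message length of Wallace and Freeman (1987), for a model with k free parameters, prior h(θ), Fisher information matrix F(θ), and likelihood f(x|θ), is commonly stated as

```latex
I_{87}(\theta; x)
  = \underbrace{-\log h(\theta)
                + \tfrac{1}{2}\log\bigl|F(\theta)\bigr|
                + \tfrac{k}{2}\log\kappa_k}_{\text{assertion}}
  \;+\; \underbrace{-\log f(x \mid \theta) + \tfrac{k}{2}}_{\text{detail}}
```

where κ_k is the optimal quantising lattice constant in k dimensions (κ_1 = 1/12), and the trailing k/2 is the expected extra detail cost of coding the data relative to the rounded, rather than exact, parameter estimate.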
Aim 1: Theoretical Investigation of MML Properties
Explore the asymptotic behaviour of MML estimators, including consistency, convergence rates, and their connection to information geometry. Investigate how the optimal partitioning of the data space relates to Fisher-Rao geometry, and develop theoretical results characterising MML estimation under various regularity conditions.
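As a toy illustration of the asymptotics at stake: for the Gaussian with the usual scale-invariant prior, the Wallace-Freeman (MML87) estimates are the sample mean and the sum of squared deviations divided by n - 1 (the MLE divides by n). The Monte Carlo check below is a sketch of such behaviour, not a substitute for the formal consistency results this aim targets.

```python
import numpy as np

rng = np.random.default_rng(1)
mu_true, sigma2_true = 2.0, 4.0

# MML87 estimates for the Gaussian: sample mean, and the sum of
# squared deviations divided by n - 1 (unlike the MLE's n).
for n in (10, 100, 1000, 10000, 100000):
    x = rng.normal(mu_true, np.sqrt(sigma2_true), size=n)
    mu_hat = x.mean()
    sigma2_mml = np.sum((x - mu_hat) ** 2) / (n - 1)
    print(f"n={n:6d}  mu_hat={mu_hat:.3f}  sigma2_mml={sigma2_mml:.3f}")
# Both estimates settle on the true values as n grows, an empirical
# glimpse of the consistency the aim would establish theoretically.
```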
Aim 2: Development of Computational Methods for MML
Design and implement efficient algorithms for computing MML solutions beyond the one-dimensional case. Extend existing dynamic programming approaches to higher-dimensional problems or develop novel approximation methods that preserve the theoretical optimality properties of MML while remaining computationally tractable.
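To convey the flavour of the existing one-dimensional machinery: for a binomial experiment with a uniform prior on the success probability, the marginal probability of every count s is 1/(n+1), and the Strict MML code can be found by a dynamic program over groupings of counts, the setting of Farr and Wallace's algorithm. The sketch below is a simplified reading of that approach, assuming contiguous groups are optimal; it is illustrative, not a reference implementation.

```python
from math import lgamma, log, inf

def smml_binomial(n):
    """Strict MML partition of the counts {0, ..., n} of a binomial
    experiment into contiguous groups by dynamic programming, under a
    uniform prior (marginal probability of every count is 1/(n+1))."""
    r = 1.0 / (n + 1)
    log_binom = [lgamma(n + 1) - lgamma(s + 1) - lgamma(n - s + 1)
                 for s in range(n + 1)]

    def group_cost(a, b):
        # Expected length contribution of grouping counts a..b: the
        # assertion codes the group (probability q), the detail codes
        # each count under the group's single shared estimate theta.
        q = (b - a + 1) * r
        theta = 0.5 * (a + b) / n              # minimises expected detail
        theta = min(max(theta, 1e-12), 1.0 - 1e-12)
        detail = -sum(r * (log_binom[s] + s * log(theta)
                           + (n - s) * log(1.0 - theta))
                      for s in range(a, b + 1))
        return -q * log(q) + detail            # nats

    best = [0.0] + [inf] * (n + 1)             # best[j]: counts 0..j-1 coded
    cut = [0] * (n + 2)
    for j in range(1, n + 2):
        for i in range(j):
            c = best[i] + group_cost(i, j - 1)
            if c < best[j]:
                best[j], cut[j] = c, i
    groups, j = [], n + 1                      # trace the optimal cuts back
    while j > 0:
        groups.append((cut[j], j - 1))
        j = cut[j]
    return best[n + 1], groups[::-1]

length, groups = smml_binomial(20)
print(f"expected two-part length: {length:.3f} nats")
print("groups of counts:", groups)
```

Extending this beyond one dimension is the crux of the aim: the contiguity that makes this dynamic program tractable has no obvious analogue in higher-dimensional data spaces.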
Aim 3: Applications to Modern Statistical Problems
Apply MML methodology to contemporary statistical challenges such as high-dimensional inference, Bayesian hypothesis testing with complex priors, or model selection in machine learning contexts. Compare MML-based approaches with modern alternatives (BIC, AIC, MDL, cross-validation) both theoretically and empirically across diverse datasets.
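A sketch of what such an empirical comparison might look like: polynomial degree selection with known noise variance, scoring each degree by an MML87-style codelength alongside BIC and AIC (all in nats, so smaller is better). The prior width W, the per-dimension lattice constant 1/12, and the known-sigma assumption are illustrative simplifications, not canonical choices.

```python
import numpy as np

rng = np.random.default_rng(7)
n, sigma = 200, 0.5                  # known noise sd (simplifying assumption)
x = np.linspace(-1, 1, n)
y = 1.0 - 2.0 * x + 3.0 * x**3 + rng.normal(0, sigma, n)   # true degree 3

def criteria(degree, W=10.0, kappa=1.0 / 12.0):
    """Negative log-likelihood plus MML87 / BIC / AIC penalties (nats).
    Uniform prior of width W per coefficient; kappa is the common
    per-dimension lattice approximation."""
    X = np.vander(x, degree + 1)     # k = degree + 1 coefficients
    k = X.shape[1]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    nll = 0.5 * n * np.log(2 * np.pi * sigma**2) + rss / (2 * sigma**2)
    # Fisher information of the coefficients is X'X / sigma^2.
    half_log_det_F = 0.5 * np.linalg.slogdet(X.T @ X)[1] - k * np.log(sigma)
    mml = k * np.log(W) + half_log_det_F + 0.5 * k * (1 + np.log(kappa)) + nll
    bic = nll + 0.5 * k * np.log(n)
    aic = nll + k
    return mml, bic, aic

print("deg    MML87      BIC      AIC")
for d in range(1, 7):
    print(f"{d:3d}" + "".join(f"{v:9.1f}" for v in criteria(d)))
# All three criteria should bottom out near the true degree (3); the
# point of Aim 3 is to map where they agree and where they diverge.
```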