
Inference of chemical/biological networks: relational and structural learning

Primary supervisor

David Dowe

Co-supervisors


Expected outcomes: The student will learn inference and representation learning methods for network data. These methods transfer readily to other kinds of networks, including but not limited to social networks, citation networks, and communication networks. A research publication in a refereed AI conference or journal is expected. Prerequisites are listed under "Required knowledge" below.

Student cohort

Double Semester

Aim/outline

Graphs or networks are effective tools for representing a variety of data across domains. In the biological domain, chemical compounds can be represented as networks, with atoms as nodes and chemical bonds as edges. Analysing these networks is important because it may yield AI-based approaches for drug discovery. This project will focus on representing and inferring chemical or biological networks as a form of relational and structural learning. Given a network dataset, we wish to infer a model of the distribution of its elements, possibly as a mixture of several distributions. We also wish to represent the biological networks in suitable formats, e.g., vector representations, so that existing machine learning algorithms (e.g., support vector machines) can readily be used for prediction tasks, such as predicting the bioassay of a given chemical network.
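The pipeline described above (molecular graph → fixed-length vector → standard classifier) can be sketched as follows. This is a minimal illustration only: the toy "molecules", their bioassay labels, and the hand-crafted features (node count, edge count, degree histogram) are all assumptions for demonstration, not the representations the project would necessarily use.

```python
# Sketch: map small molecular graphs to fixed-length feature vectors so a
# standard classifier (here, scikit-learn's SVC) can predict a binary
# bioassay label. Graphs, labels, and features are illustrative toy data.
from sklearn.svm import SVC

def graph_features(atoms, bonds, max_degree=4):
    """Map a graph (atom symbols, bond index pairs) to a fixed-length
    vector: node count, edge count, and a capped degree histogram."""
    degree = [0] * len(atoms)
    for u, v in bonds:
        degree[u] += 1
        degree[v] += 1
    hist = [0] * (max_degree + 1)
    for d in degree:
        hist[min(d, max_degree)] += 1
    return [len(atoms), len(bonds)] + hist

# Toy "molecules": (atom list, bond list) pairs with made-up labels.
graphs = [
    (["C", "C", "O"], [(0, 1), (1, 2)]),
    (["C", "O"], [(0, 1)]),
    (["C", "C", "C", "N"], [(0, 1), (1, 2), (2, 3), (3, 0)]),
    (["N", "O"], [(0, 1)]),
]
labels = [1, 0, 1, 0]  # hypothetical bioassay outcomes

X = [graph_features(atoms, bonds) for atoms, bonds in graphs]
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X))
```

In practice, richer representations (e.g., learned graph embeddings, as in the graph autoencoder reference below) would replace the hand-crafted features, but the overall shape of the pipeline is the same.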

URLs/references

 Comley, Joshua W. and D.L. Dowe (2003). General Bayesian Networks and Asymmetric Languages, Proc. 2nd Hawaii International Conference on Statistics and Related Fields, 5-8 June, 2003

 Comley, Joshua W. and D.L. Dowe (2005). "Minimum Message Length and Generalized Bayesian Nets with Asymmetric Languages", Chapter 11 (pp265-294) in P. Grünwald, I. J. Myung and M. A. Pitt (eds.), Advances in Minimum Description Length: Theory and Applications, M.I.T. Press (MIT Press), April 2005, ISBN 0-262-07262-9. [Final camera-ready copy was submitted in October 2003.]

 David L. Dowe and Nayyar A. Zaidi (2010), "Database Normalization as a By-product of Minimum Message Length Inference", Proc. 23rd Australian Joint Conference on Artificial Intelligence (AI'2010) [Springer Lecture Notes in Artificial Intelligence (LNAI), vol. 6464], Adelaide, Australia, 7-10 December 2010, Springer, pp82-91

 Shirui Pan, Ruiqi Hu, Guodong Long, Jing Jiang, Lina Yao, Chengqi Zhang: Adversarially Regularized Graph Autoencoder for Graph Embedding. IJCAI 2018: 2609-2615

 Shirui Pan, Jia Wu, Xingquan Zhu, Guodong Long, Chengqi Zhang: Finding the best not the most: regularized loss minimization subgraph selection for graph classification. Pattern Recognition 48(11): 3783-3796 (2015)

 Shirui Pan, Jia Wu, Xingquan Zhu, Chengqi Zhang, Yang Wang: Tri-Party Deep Network Representation. IJCAI 2016: 1895-1901

 G. Visser, P. E. R. Dale, D. L. Dowe, E. Ndoen, M. B. Dale and N. Sipe (2012), "A novel approach for modeling malaria incidence using complex categorical household data: The minimum message length (MML) method applied to Indonesian data", Computational Ecology and Software, 2012, 2(3):140-159

 Wallace, C.S. (2005), Statistical and Inductive Inference by Minimum Message Length, Springer.

 Wallace, C.S. and D.L. Dowe (1999a). Minimum Message Length and Kolmogorov Complexity, Computer Journal , Vol. 42, No. 4, pp270-283

 Wallace, C.S. and D.L. Dowe (2000). MML clustering of multi-state, Poisson, von Mises circular and Gaussian distributions, Statistics and Computing, Vol. 10, No. 1, Jan. 2000, pp73-83

Required knowledge

The student should ideally have a reasonable mathematical background, including differential calculus (e.g., partial derivatives) and matrix determinants. The student should also be able to program in Matlab, Java, or Python. Ideally, the student also understands the data-analysis process, including data pre-processing, algorithm selection, and evaluation.