
Graph Learning for Out of Distribution Data

Primary supervisor

Teresa Wang

Many kinds of real-world data can be naturally represented as graphs, including biological networks, molecular graphs, academic networks, and knowledge graphs. Modelling graph data effectively benefits many real-world tasks. Most real-world graphs, such as biological networks, academic networks and knowledge graphs, also include temporal information.

However, most benchmark studies in this area follow the convention of using random splits to generate training and test sets. This is unrealistic for real-world applications and generally leads to overly optimistic performance results. There is therefore an urgent need to study how to model graphs so that models generalize well under realistic data splits, where most data in the test set are out-of-distribution (i.e., never or seldom seen in the training set). For example, random splits are unrealistic for friend recommendation in social networks, where test edges (friend relations after a certain timestamp) naturally follow a different distribution from training edges (friend relations before that timestamp).
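The contrast between a random split and a time-based split can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's method: the `(source, target, timestamp)` edge layout, the function names, the toy graph, and the choice to place edges at exactly the cutoff timestamp in the test set are all assumptions made for the example.

```python
import random
from typing import List, Tuple

# Hypothetical edge layout for this sketch: (source, target, timestamp).
Edge = Tuple[int, int, int]


def random_split(edges: List[Edge], test_fraction: float, seed: int = 0):
    """Conventional split: shuffle the edges and ignore their timestamps."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(edges: List[Edge], cutoff: int):
    """Time-based split: train on edges before the cutoff, test on the rest.

    Assumption: edges stamped exactly at the cutoff go to the test set.
    """
    train = [e for e in edges if e[2] < cutoff]
    test = [e for e in edges if e[2] >= cutoff]
    return train, test


def unseen_node_fraction(train: List[Edge], test: List[Edge]) -> float:
    """Fraction of test edges that touch a node absent from the training edges."""
    seen = {node for u, v, _ in train for node in (u, v)}
    if not test:
        return 0.0
    unseen = sum(1 for u, v, _ in test if u not in seen or v not in seen)
    return unseen / len(test)


# Toy graph that grows over time: later edges attach newly arrived nodes.
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6)]

train, test = temporal_split(edges, cutoff=4)
print(unseen_node_fraction(train, test))
```

In this toy graph every test edge produced by the time-based split involves a node that never appears in training, which is the kind of shift a random split tends to hide.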

 

Required knowledge

foundations in machine learning;

excellent Python programming skills;

an interest in further PhD study is preferred.