
[NextGen] Secure and Privacy-Enhancing Federated Learning: Algorithms, Framework, and Applications to NLP and Medical AI

Primary supervisor

Xingliang Yuan

Research area

Cybersecurity

Federated learning (FL) is an emerging machine learning paradigm that enables distributed clients (e.g., mobile devices) to jointly train a machine learning model without pooling their raw data into a centralised server. Because data never leaves the clients, FL systematically mitigates the privacy risks of centralised machine learning and naturally complies with rigorous data privacy regulations, such as the GDPR and the Privacy Act 1988.
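As a minimal sketch of this idea (all function names and numbers here are illustrative, not part of any specific FL framework), each client trains locally and sends only a model update; the server combines the updates with a weighted average, as in the well-known FedAvg rule:

```python
import numpy as np

def federated_average(client_updates, client_sizes):
    """Weighted average of client model updates (the FedAvg rule).

    client_updates: list of parameter vectors, one per client
    client_sizes: number of local training examples per client
    """
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * u for w, u in zip(weights, client_updates))

# Two hypothetical clients: only their parameter updates, never their
# raw training data, reach the server.
global_update = federated_average(
    [np.array([1.0, 2.0]), np.array([3.0, 4.0])],
    client_sizes=[10, 30])
# Weighted by data size (0.25 and 0.75), this yields [2.5, 3.5].
```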

FL has recently drawn significant attention from both research communities and industry, and has been applied in real-world applications, e.g., keyboard suggestions [1] and emoji prediction [2]. Despite its great potential, FL faces critical security, privacy, and performance challenges under practical constraints. For example, poisoning attacks against FL [3] can heavily degrade model performance and cause misclassification; in medical applications, such attacks could result in misdiagnosis, incorrect treatment, and even death. Another noteworthy example is the gradient leakage attack [4], in which original client data can be recovered from the model gradients that clients submit to the server. Such an attack directly breaks the claimed privacy guarantees of FL.
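To see why poisoning is so damaging, consider a toy example (hypothetical numbers): with plain averaging, a single malicious client can drag the aggregated update arbitrarily far from the honest consensus.

```python
import numpy as np

# Three honest clients submit similar updates; one attacker submits an
# extreme update to poison the aggregate.
honest = [np.array([0.9, 1.1]), np.array([1.0, 1.0]), np.array([1.1, 0.9])]
malicious = np.array([100.0, -100.0])

clean_avg = np.mean(np.stack(honest), axis=0)                   # [1.0, 1.0]
poisoned_avg = np.mean(np.stack(honest + [malicious]), axis=0)  # [25.75, -24.25]
```

One attacker among four clients shifts the average by more than an order of magnitude, which motivates the robust aggregation rules studied in this project.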

In this project, we aim at an ambitious goal: designing secure and privacy-enhancing algorithms and frameworks for FL, and applying our designs in real-world applications. To achieve this goal, we expect to explore the following directions deeply and comprehensively:

1) Investigate privacy-enhancing techniques (e.g., differential privacy and/or secure multi-party computation) and design protocols to secure model updates throughout the process of FL collaborative training;

2) Investigate and design security measures (e.g., robust aggregation rules) to defeat model poisoning attacks against FL;

3) Design and optimise task-specific FL algorithms for NLP and Medical AI;

4) Design and implement a user-friendly FL framework for both research and technology adoption purposes.
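Directions 1) and 2) can be sketched in a few lines (a simplified illustration under assumed parameters, not the project's actual protocols): clipping plus Gaussian noise is the core step of differentially private aggregation, and a coordinate-wise median is one simple robust aggregation rule.

```python
import numpy as np

rng = np.random.default_rng(0)

def clip_and_noise(update, clip_norm=1.0, noise_std=0.1):
    # Direction 1 sketch: bound the influence of any single client by
    # clipping the update's L2 norm, then add Gaussian noise -- the
    # basic mechanism behind differentially private FL aggregation.
    norm = max(np.linalg.norm(update), 1e-12)
    clipped = update * min(1.0, clip_norm / norm)
    return clipped + rng.normal(0.0, noise_std, size=update.shape)

def median_aggregate(updates):
    # Direction 2 sketch: coordinate-wise median, a robust aggregation
    # rule that tolerates a minority of poisoned updates.
    return np.median(np.stack(updates), axis=0)
```

In practice these building blocks are combined with secure multi-party computation and careful privacy accounting; the sketch only conveys the intuition behind each defense.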

We expect to recruit a cohort of students with diverse backgrounds and skill sets in machine learning, cybersecurity, and digital health to work on the above directions in a collaborative manner. This project will provide multiple competitive scholarships funded by a national elite HDR training program, the Data61 Next Generation Graduates program (https://www.csiro.au/en/work-with-us/funding-programs/programs/next-generation-graduates-programs), under a project titled “Privacy-preserving Machine Learning: Technology Development and Adoption” led by the Monash Faculty of IT.

References:

[1] “Federated Learning: Collaborative Machine Learning without Centralized Training Data”, https://ai.googleblog.com/2017/04/federated-learning-collaborative.html

[2] “Learning with Privacy at Scale”, https://machinelearning.apple.com/research/learning-with-privacy-at-scale

[3] Fang et al., "Local Model Poisoning Attacks to Byzantine-Robust Federated Learning". In USENIX Security Symposium, 2020. 

[4] Zhu et al., "Deep Leakage from Gradients." Advances in Neural Information Processing Systems, 2019.

 

Required knowledge

Eligibility and Requirements:

  • Domestic students (Australian citizens or permanent residents) for PhD, Masters, and Honours;
  • Meet the entry requirements of Monash for PhD (https://www.monash.edu/graduate-research/future-students/apply);
  • Knowledge of machine learning (familiarity with NLP and Medical AI is a plus);
  • Knowledge of cybersecurity (strong mathematics and basic cryptography are a plus);
  • Programming skills in Python (familiarity with C++ and ML/FL frameworks is a plus).
     

Benefits for PhD students:

  • Receive world-class research training at Monash FIT under a renowned multidisciplinary team, including the Monash Cybersecurity Group, the Monash ML&VL Group, and the Monash Medical AI Group;
  • Work with a cohort of talented students to address real-world problems and challenges;
  • Have the chance to work with leading industry partners (eBay, Ansen Innovation, Eyetelligence, Data61) and secure positions after graduation;
  • Access the resources of CSIRO's Data61, Australia's national science agency; the distributed security systems group of Data61 will provide support and co-supervision;
  • AI-focused coursework and keynotes delivered by Data61, e.g., ‘AI 101’, ‘Ethics in Technology’, ‘Entrepreneurship in Technology’, ‘Data-Centric Engineering’ and ‘Data and Decisions’;
  • Stipend rate: $40,500 per annum; Training allowance: $5,000 per annum; Travel allowance: $5,000; Thesis allowance: $840.
     

Benefits for Masters (minor thesis or coursework) and Honours students:

  • Receive world-class research training at Monash FIT under a renowned multidisciplinary team, including the Monash Cybersecurity Group, the Monash ML&VL Group, and the Monash Medical AI Group;
  • Work with a cohort of talented students to address real-world problems and challenges;
  • Have the chance to work with leading industry partners (eBay, Ansen Innovation, Eyetelligence, Data61) and secure positions after graduation;
  • Stipend rate: $30,000 for one year (Masters by minor thesis); $15,000 for one year (Masters by coursework and Honours students).
     

Lead Supervision Team: 

  • Xingliang Yuan (Cybersecurity, Privacy-enhancing Techniques, Machine Learning Systems): xingliang.yuan@monash.edu
  • Reza Haffari (Machine Learning, NLP): gholamreza.haffari@monash.edu
  • Zongyuan Ge (Medical AI, Medical Imaging, Biomedical Engineering): zongyuan.ge@monash.edu
  • Lizhen Qu (NLP, Deep Learning): lizhen.qu@monash.edu


 

Project funding

Other
