Privacy-preserving Machine Learning

Primary supervisor

Shujie Cui

Co-supervisors

  • Giovanni Russello

Machine learning (ML) training and evaluation usually involve large-scale datasets and complex computation. A promising way to process such data efficiently is to outsource these tasks to cloud platforms. However, the traditional approach of collecting users' data at the cloud is vulnerable to data breaches. Specifically, while training a model or serving inference requests, the cloud server could learn the training data, the model structure, user queries, and inference results, all of which may be sensitive to users or companies.

Much research has been conducted on protecting the privacy of outsourced ML tasks with cryptographic primitives, such as Secure Multi-Party Computation (SMC) and Homomorphic Encryption (HE). Nevertheless, SMC-based approaches do not scale well due to their large communication overhead, and HE-based approaches are markedly inefficient, particularly when evaluating non-linear functions.
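To make the trade-off concrete, the toy sketch below uses a Paillier-style additively homomorphic scheme (with deliberately tiny, insecure parameters; practical HE systems use lattice-based schemes such as BFV or CKKS with far larger parameters). Additions on ciphertexts come almost for free, whereas a non-linear step such as ReLU has no such shortcut and, under pure HE, must be approximated by polynomials, which is where most of the cost arises.

```python
import math
import random

# Toy Paillier keypair with tiny primes -- for illustration only, NOT secure.
p, q = 1789, 1861
n, n2 = p * q, (p * q) ** 2
g = n + 1
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # for g = n + 1, L(g^lam mod n^2) = lam

def enc(m):
    """Encrypt m: c = g^m * r^n mod n^2, with r coprime to n."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    """Decrypt c: m = L(c^lam mod n^2) * mu mod n, where L(x) = (x-1)/n."""
    return ((pow(c, lam, n2) - 1) // n) * mu % n

# Additive homomorphism: multiplying ciphertexts adds the plaintexts.
a, b = 123, 456
assert dec(enc(a) * enc(b) % n2) == a + b        # HE-friendly addition
assert dec(pow(enc(a), 3, n2)) == 3 * a          # scalar multiplication

# By contrast, ReLU(x) = max(x, 0) cannot be computed directly on
# ciphertexts: the server would need a polynomial approximation, at a
# large cost in ciphertext multiplications.
```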

 

Student cohort

Single Semester
Double Semester

Aim/outline

Using trusted hardware, e.g., Intel SGX, is a potential way to accelerate ML tasks while ensuring data privacy. This project aims to propose an efficient hybrid framework for securely executing convolutional neural network (CNN) evaluation at cloud servers by combining HE and SGX.

The basic idea is to split the ML computation into HE-friendly functions (e.g., additions and matrix multiplications), which are protected with HE, and complex functions (e.g., if-else branching and matrix inversion), which are computed in cleartext inside the SGX enclave.
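The partition above can be sketched as follows. This is a purely structural, hypothetical sketch: plain Python arithmetic stands in for ciphertext operations, and an ordinary function stands in for the enclave; a real system would use an HE library (e.g., Microsoft SEAL) for the linear stage and the SGX SDK for the enclave stage.

```python
# "HE-friendly" stage: linear operations only. Dot products and additions
# map directly onto HE ciphertext operations, so this stage could run
# entirely over encrypted data on the untrusted cloud.
def he_linear_layer(x, weights, bias):
    return [sum(wi * xi for wi, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

# "Enclave" stage: non-linear operations (ReLU here, but also branching,
# pooling with comparisons, etc.) run in cleartext inside SGX after the
# intermediate ciphertexts are decrypted within the enclave.
def enclave_relu(x):
    return [max(v, 0.0) for v in x]

# One CNN block: HE linear layer -> decrypt into enclave -> non-linearity
# -> re-encrypt -> next HE linear layer, and so on through the network.
x = [1.0, -2.0, 3.0]
W = [[1.0, 0.0, 1.0], [0.0, -1.0, 0.0]]
b = [0.5, 0.5]
y = enclave_relu(he_linear_layer(x, W, b))  # -> [4.5, 2.5]
```

The design choice this illustrates: only the small, non-linear portion of each layer needs the enclave's limited trusted memory, while the bulky linear algebra stays outside under HE.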

 

Required knowledge

1. Basic knowledge of machine learning, especially CNN training and evaluation

2. Basic knowledge of cryptography and security

3. Basic knowledge of Intel SGX

4. Proficiency in C/C++ programming