REVEALING PRIVACY RISKS IN PRIVACY-PRESERVING MACHINE LEARNING: A STUDY OF DATA INFERENCE ATTACKS
LUO XINJIAN
Abstract
Modern machine learning (ML) models require large datasets, but obtaining such data while ensuring privacy is challenging, especially in sensitive fields like healthcare and finance. This thesis examines privacy-preserving methods in ML by designing data inference attacks to identify privacy risks.
We show that the data preprocessing method InstaHide leaks private information: its encoded images can be reconstructed back to the originals (a minimal sketch of the encoding appears below). In vertical federated learning (VFL), where participants never share raw data directly, the intermediate outputs of federated models can still expose private features. Shapley value-based model explanations can reveal private inputs from the explanation reports alone. Generative model sharing can be compromised by both the data sharer and the data receiver through poisoning and inference attacks.
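For context, InstaHide encodes each private image as a mixup-style convex combination with other images followed by a random pixel-wise sign flip; reconstruction attacks in this line of work exploit the fact that the sign flip preserves absolute pixel values. The following is a minimal sketch of the encoding step only, not the thesis's attack code; the function name, the Dirichlet sampling of mixing coefficients, and the public-pool sampling are illustrative assumptions.

```python
import numpy as np

def instahide_encode(x_private, public_pool, k=4, rng=None):
    """Minimal sketch of InstaHide-style encoding (illustrative, assumed parameters).

    Mixes one private image with k-1 images drawn from a public pool
    (mixup-style convex combination), then applies a random per-pixel
    sign flip. Images are float arrays scaled to [-1, 1].
    """
    rng = np.random.default_rng() if rng is None else rng
    # Draw k-1 public images to mix with the private one.
    idx = rng.choice(len(public_pool), size=k - 1, replace=False)
    images = [x_private] + [public_pool[i] for i in idx]
    # Random convex mixing coefficients summing to 1 (assumption: Dirichlet).
    lam = rng.dirichlet(np.ones(k))
    mixed = sum(l * img for l, img in zip(lam, images))
    # Random pixel-wise sign flip: hides signs but leaves |pixel| intact,
    # which is one source of the leakage the thesis demonstrates.
    sigma = rng.choice([-1.0, 1.0], size=mixed.shape)
    return sigma * mixed
```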
To mitigate these risks, we suggest using regularization, differential privacy, and homomorphic encryption to protect intermediate outputs and model parameters. Our findings provide insights for designing robust, privacy-preserving ML applications.
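As one illustration of the suggested mitigations, an intermediate output can be perturbed with the Gaussian mechanism of differential privacy before it leaves a participant. This is a minimal sketch assuming L2-clipped embeddings and the standard (epsilon, delta) noise calibration; the function and parameter names are illustrative, not the thesis's implementation.

```python
import numpy as np

def dp_release(embedding, clip_norm=1.0, epsilon=1.0, delta=1e-5, rng=None):
    """Minimal sketch: release an intermediate embedding under (eps, delta)-DP
    via the Gaussian mechanism. Clipping bounds the per-record L2 sensitivity;
    noise scale uses the standard calibration
    sigma = clip_norm * sqrt(2 ln(1.25 / delta)) / epsilon.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Clip so each record contributes at most clip_norm in L2 norm.
    norm = np.linalg.norm(embedding)
    clipped = embedding * min(1.0, clip_norm / max(norm, 1e-12))
    # Calibrate Gaussian noise to the (epsilon, delta) budget.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped + rng.normal(0.0, sigma, size=clipped.shape)
```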
Keywords
Machine Learning, Attacks, Privacy Risks, Federated Learning, Model Explanations, Model Sharing
Date
2024-05-15
Type
Thesis