Title: Unlocking High-Accuracy Training of Image Classification Models With Differential Privacy
A recent DeepMind paper on the ethical and social risks of language and image models highlighted, among other concerns, the potential leakage of sensitive information and the privacy risks it creates. Organizations developing these models have a responsibility to address those risks. This article focuses on differential privacy (DP), a privacy-enhancing technology, as a way to mitigate privacy risks when training image classification models.
Privacy Risks in Training Data Leakage:
Language models and image classification models have been found to leak sensitive information about their training data. Malicious parties could exploit such leakage to reconstruct the training data from the model. These risks have raised concerns about data privacy and the need for effective privacy-enhancing technologies.
Differential Privacy: A Solution to Protect Privacy:
Differential privacy (DP) is a mathematical framework for protecting individual records during statistical data analysis and machine learning model training. DP algorithms inject carefully calibrated noise into their computations so that the output reveals almost nothing about any single individual's record. These guarantees are robust and quantifiable, and DP has become a standard adopted by both public and private organizations.
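Formally, a randomized mechanism M satisfies (ε, δ)-differential privacy if, for any two datasets D and D' differing in a single record, and for any set of outcomes S, the following holds (this is the standard textbook definition, stated here for context rather than taken from the article):

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```

Smaller values of ε and δ mean the mechanism's output depends less on any one record, i.e., stronger privacy.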
Differentially Private Stochastic Gradient Descent (DP-SGD):
DP-SGD is the most widely used DP algorithm for deep learning. It modifies standard stochastic gradient descent (SGD) by clipping each example's gradient to a fixed norm and then adding calibrated Gaussian noise to the aggregated gradient, which is what yields the privacy guarantee. However, prior work has shown that DP-SGD often produces significantly less accurate models, especially for the larger neural networks used on challenging image classification tasks.
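The clip-then-noise update at the heart of DP-SGD can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's open-sourced JAX implementation; the function name, parameter names, and default values here are assumptions chosen for clarity.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.1, lr=0.1, rng=None):
    """One DP-SGD update: clip each example's gradient to clip_norm,
    sum, add Gaussian noise scaled to clip_norm, then average.

    per_example_grads has shape (batch_size, num_params)."""
    rng = rng or np.random.default_rng(0)
    # Per-example clipping: rescale any gradient whose L2 norm exceeds clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * factors
    # Noise std is noise_multiplier * clip_norm, matching the clipping bound.
    batch_size = per_example_grads.shape[0]
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=params.shape)
    return params - lr * noisy_sum / batch_size

# Toy usage: a batch of 4 examples, 3 parameters.
params = np.zeros(3)
grads = np.array([[3.0, 4.0, 0.0],   # norm 5 -> rescaled to norm 1
                  [0.1, 0.0, 0.0],
                  [0.0, 0.2, 0.0],
                  [0.0, 0.0, 0.3]])
new_params = dp_sgd_step(params, grads)
```

The key design point is that clipping bounds each example's influence on the update, so the added noise can be calibrated to mask any single example's contribution.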
Improving DP Training for Image Classification Models:
Our research focuses on improving the accuracy of DP training on standard image classification benchmarks. We propose simple modifications to the training procedure and model architecture that yield a significant improvement in accuracy. We discovered that well-behaved gradients in DP-SGD enable efficient training of much deeper models than previously thought, potentially unlocking practical applications of image classification models with formal privacy guarantees.
Key Findings and Results:
Training privately from scratch, with no additional data, we improve accuracy on CIFAR-10 by roughly 10 percentage points over previous work. When privately fine-tuning a model pre-trained on a different dataset, we reach 86.7% top-1 accuracy on ImageNet, approaching the performance of non-private models.
Standard Setting and Implementation:
Our results were achieved at ε=8, a privacy budget commonly used to benchmark the strength of the protection that differential privacy provides (smaller ε means stronger privacy). The paper discusses this parameter in detail and presents additional experimental results for other values of ε and other datasets. To promote transparency and further research, we have open-sourced our implementation on GitHub for verification and collaboration.
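As a toy illustration of how ε trades off against noise, consider the classical Gaussian mechanism, whose standard calibration σ = √(2 ln(1.25/δ)) · Δ / ε holds only for ε ≤ 1. This is not the accounting used for DP-SGD (which relies on tighter composition accountants and supports ε=8), but it shows the basic relationship: halving ε doubles the required noise. The function name below is hypothetical.

```python
import math

def gaussian_mechanism_sigma(epsilon, delta, sensitivity=1.0):
    """Noise scale for the classical Gaussian mechanism.
    Valid only for 0 < epsilon <= 1; shown here purely to illustrate
    the epsilon/noise trade-off, not DP-SGD's actual accounting."""
    if not (0 < epsilon <= 1):
        raise ValueError("this bound only holds for 0 < epsilon <= 1")
    return math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / epsilon

# Stronger privacy (smaller epsilon) demands proportionally more noise.
sigma_strong = gaussian_mechanism_sigma(0.5, 1e-5)
sigma_weak = gaussian_mechanism_sigma(1.0, 1e-5)
```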
By leveraging differential privacy and making crucial modifications to training procedures and model architectures, our research demonstrates a significant improvement in the accuracy of DP training for image classification models. These findings have the potential to enable practical applications of privacy-protected image classification models. Download our JAX implementation from GitHub to explore and contribute to this field of research.