Federated Learning and Differential Privacy for Training Large Neural Network Language Models
Federated Learning (FL) trains a model on data that stays on users' devices, with only model updates sent to a central server, while Differential Privacy (DP) provides a formal guarantee that those updates reveal little about any individual's data. The goal is to train a large neural network language model (NNLM) on devices with constrained computation and memory while safeguarding user privacy by combining FL and DP.
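To make the combination concrete, below is a minimal sketch of one round of DP federated averaging: each device's update is clipped to a fixed norm, the clipped updates are averaged, and Gaussian noise calibrated to that norm is added before the server applies the result. The function name, parameters, and the specific noise scaling are illustrative assumptions, not the exact recipe used in the work described here.

```python
import numpy as np

def dp_fedavg_round(client_updates, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One round of DP federated averaging (illustrative sketch).

    client_updates: list of 1-D numpy arrays, each a flattened model delta
    computed locally on one device.
    """
    rng = rng or np.random.default_rng()
    n = len(client_updates)

    clipped = []
    for u in client_updates:
        norm = np.linalg.norm(u)
        scale = min(1.0, clip_norm / (norm + 1e-12))  # clip to bound per-client sensitivity
        clipped.append(u * scale)

    avg = np.mean(clipped, axis=0)
    # Gaussian noise with standard deviation proportional to the clip norm;
    # larger payloads mean more coordinates receiving this noise.
    sigma = noise_multiplier * clip_norm / n
    return avg + rng.normal(0.0, sigma, size=avg.shape)
```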
Partial Embedding Updates (PEU) for Reducing Noise
The noise that DP adds to each model update scales with the size of the update payload, so for large models it can overwhelm the learning signal and hinder convergence. Partial Embedding Updates (PEU) address this by having each device update only a portion of the embedding table, shrinking the payload and therefore the noise required for the same privacy guarantee, which enables more effective convergence.
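The sketch below illustrates the payload-reduction idea: a device trains only a designated subset of embedding rows and transmits just those rows' deltas. The helper names and the choice of which rows to update (e.g., the most frequent tokens) are assumptions made for illustration, not the paper's exact procedure.

```python
import torch

def partial_embedding_delta(old_emb, new_emb, update_rows):
    """Extract the update for a subset of embedding rows (PEU-style sketch).

    old_emb, new_emb: (vocab_size, dim) embedding tensors before/after local training.
    update_rows: indices of vocabulary rows the device is allowed to update;
                 all other rows stay frozen, so they need not be transmitted.
    """
    rows = torch.as_tensor(update_rows, dtype=torch.long)
    delta = new_emb[rows] - old_emb[rows]
    return rows, delta

def apply_partial_delta(server_emb, rows, delta, lr=1.0):
    """Apply the transmitted partial update to the server-side embedding table."""
    server_emb[rows] += lr * delta
    return server_emb
```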
Reducing Memory Demands with Low Rank Adaptation (LoRA) and Noise Contrastive Estimation (NCE)
Low Rank Adaptation (LoRA) and Noise Contrastive Estimation (NCE) are adopted to reduce the memory and compute demands of large models on resource-constrained devices: LoRA freezes the pretrained weights and trains only small low-rank adapter matrices, while NCE replaces the full-vocabulary softmax with scoring the target word against a small set of sampled negatives. Combined with PEU, these techniques make it feasible to train large-vocabulary language models on-device without compromising accuracy or privacy.
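The following sketch shows both ideas in PyTorch: a linear layer whose frozen base weight is augmented with a trainable low-rank product, and a simplified NCE/sampled-softmax loss that scores the true next word against a few random negatives. Class and function names, the rank and scaling values, and the uniform negative sampling are assumptions for illustration rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LoRALinear(nn.Module):
    """Linear layer with a low-rank adapter (minimal LoRA-style sketch).

    The frozen base weight W is augmented with a trainable product B @ A,
    so only r * (in_features + out_features) parameters are trained and
    transmitted instead of in_features * out_features.
    """
    def __init__(self, in_features, out_features, r=4, alpha=8):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)               # frozen pretrained weight
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        low_rank = F.linear(F.linear(x, self.lora_A), self.lora_B)
        return self.base(x) + self.scaling * low_rank


def nce_loss(hidden, out_embedding, target_ids, num_negatives, vocab_size):
    """Score the target word against sampled negatives instead of the whole
    vocabulary (simplified NCE/sampled-softmax sketch).

    hidden:        (batch, dim) context representations
    out_embedding: (vocab_size, dim) output embedding matrix
    target_ids:    (batch,) true next-word ids
    """
    batch = hidden.size(0)
    neg_ids = torch.randint(0, vocab_size, (batch, num_negatives))
    cand_ids = torch.cat([target_ids.unsqueeze(1), neg_ids], dim=1)   # target at index 0
    cand_emb = out_embedding[cand_ids]                                # (batch, 1 + k, dim)
    logits = torch.einsum("bd,bkd->bk", hidden, cand_emb)
    labels = torch.zeros(batch, dtype=torch.long)                     # correct class is index 0
    return F.cross_entropy(logits, labels)
```

Only the adapter parameters and the sampled candidate rows participate in training, which is what keeps both the on-device memory footprint and the transmitted (and noised) payload small.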