Adaptive Weight Decay: Boosting Robustness and Performance On-the-Fly

Adaptive Weight Decay for Enhanced AI Performance

Adaptive weight decay is a training technique that automatically adjusts the weight decay hyper-parameter at every training iteration. The method can substantially improve adversarial robustness without requiring extra data, which makes it an attractive option for training robust models.

The weight decay hyper-parameter is changed on the fly based on the strength of the updates from the classification loss and from the regularization loss. This simple modification yields a 20% relative robustness improvement on CIFAR-100 and a 10% relative robustness improvement on CIFAR-10, compared with the best-tuned hyper-parameters of traditional weight decay.
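To make the idea concrete, here is a minimal sketch of one way such an on-the-fly adjustment could look. It is an illustration under stated assumptions, not the authors' implementation. It assumes plain SGD and assumes the coefficient is chosen so that the norm of the decay term stays at a fixed fraction of the norm of the classification-loss gradient. The names `adaptive_decay_coefficient`, `sgd_step`, and `ratio` are hypothetical and chosen only for this example.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def adaptive_decay_coefficient(weights, loss_grad, ratio=0.02, eps=1e-12):
    """Assumed rule: choose lam so that ||lam * w|| is about ratio * ||loss_grad||.

    `ratio` is a hypothetical hyper-parameter fixing the relative strength of
    the regularization update versus the classification-loss update.
    """
    return ratio * l2_norm(loss_grad) / (l2_norm(weights) + eps)


def sgd_step(weights, loss_grad, lr=0.1, ratio=0.02):
    """One plain SGD step in which the decay coefficient is recomputed on the fly."""
    lam = adaptive_decay_coefficient(weights, loss_grad, ratio)
    new_weights = [w - lr * (g + lam * w) for w, g in zip(weights, loss_grad)]
    return new_weights, lam


# Usage: the coefficient grows when the loss gradient is large relative to the
# weights and shrinks when it is small, instead of staying fixed.
weights = [3.0, 4.0]
loss_grad = [0.6, 0.8]
new_weights, lam = sgd_step(weights, loss_grad)
```

In a traditional setup `lam` would be a constant tuned once before training; in this sketch it is recomputed from the current weights and gradient at every step.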

The method has other benefits as well: it is less sensitive to the learning rate and produces smaller weight norms. These properties contribute to robustness against overfitting to label noise and to robustness under pruning, making adaptive weight decay a promising development in the field of AI.
