Efficient and Effective Parameter-free Pruning for Optimal Model Performance

DNN pruning is a widely used method to reduce model size, improve inference speed, and lower power consumption on DNN accelerators. However, existing approaches are often complex, computationally expensive, and fail to generalize across vision and language tasks or across DNN architectures. They may also fail to satisfy structured pruning constraints. In this article, we introduce an innovative train-time pruning scheme called Parameter-free Differentiable Pruning (PDP), which delivers exceptional results in terms of model size, accuracy, and training cost.

What is PDP?

PDP is a pruning technique that generates soft pruning masks for model weights in a parameter-free manner: the masks are computed as a dynamic function of the weights themselves during training, with no additional learnable parameters. Because the masks are differentiable, PDP offers a simple and efficient solution for various vision and natural language tasks, and it achieves state-of-the-art results for random, structured, and channel pruning across different DNN architectures.
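
To make the idea concrete, below is a minimal PyTorch sketch of a magnitude-based soft mask in this spirit. It is an illustrative approximation rather than the paper's exact formulation: the soft_pruning_mask function, the quantile-based threshold, and the temperature parameter are assumptions made for the example.

```python
import torch

def soft_pruning_mask(weights: torch.Tensor, target_sparsity: float,
                      temperature: float = 1e-3) -> torch.Tensor:
    """Differentiable soft pruning mask from weight magnitudes (sketch).

    The threshold is the target-sparsity quantile of the squared weights,
    and the mask is a temperature-scaled sigmoid of each weight's squared
    magnitude relative to that threshold: weights well above the threshold
    get a mask near 1, weights below it a mask near 0.
    """
    # Threshold chosen so that `target_sparsity` of the weights fall below it.
    w_sq = weights.detach().pow(2)
    threshold = torch.quantile(w_sq.flatten(), target_sparsity)
    # Soft, differentiable mask; a lower temperature yields a harder 0/1 mask.
    return torch.sigmoid((weights.pow(2) - threshold) / temperature)

# Usage: multiply weights by the soft mask in the forward pass so gradients
# still flow to every weight while training drives the model toward sparsity.
w = torch.randn(256, 512, requires_grad=True)
masked_w = w * soft_pruning_mask(w, target_sparsity=0.9)
```

Since the mask depends only on the weights and a fixed sparsity target, no extra mask parameters need to be stored or optimized, which is what "parameter-free" refers to.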

Outstanding Results Achieved

PDP has demonstrated impressive performance on various tasks. For instance, when applied to MobileNet-v1, PDP achieves 68.2% top-1 ImageNet1k accuracy at 86.6% sparsity, 1.7% higher than the best state-of-the-art algorithms. On Multi-Genre Natural Language Inference, PDP achieves over 83.1% accuracy at 90% sparsity for BERT, outperforming existing techniques, which reach at most 81.5%.

PDP’s effectiveness also extends to structured pruning. In the case of 1:4 structured pruning of ResNet18, PDP improves the top-1 ImageNet1K accuracy by more than 3.6% compared to the current best approach. Similarly, for channel pruning of ResNet50, PDP only reduces the top-1 ImageNet1K accuracy by 0.6%, a slight improvement over the state-of-the-art.
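
For readers unfamiliar with the N:M notation, a 1:4 constraint keeps exactly one nonzero weight in every group of four, so accelerators can exploit the regular sparsity pattern. The hypothetical sketch below illustrates the constraint itself with a hard magnitude-based mask; it does not reproduce how PDP's soft masks are trained to satisfy it.

```python
import torch

def one_in_four_mask(weights: torch.Tensor) -> torch.Tensor:
    """Hard 1:4 structured mask: keep the single largest-magnitude weight
    in each consecutive group of 4, zeroing the other three."""
    flat = weights.reshape(-1, 4)                  # groups of 4 weights
    keep = flat.abs().argmax(dim=1, keepdim=True)  # index of the survivor
    mask = torch.zeros_like(flat).scatter_(1, keep, 1.0)
    return mask.reshape(weights.shape)

w = torch.randn(8, 16)
sparse_w = w * one_in_four_mask(w)  # exactly 25% of the weights remain
```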

With its impressive results and broad applicability, PDP offers a promising solution for DNN model pruning. It effectively reduces model size, improves accuracy, and lowers training costs, making it an essential tool for AI practitioners and researchers.
