A Survey of Model Compression and Acceleration for Deep Neural Networks

I. Introduction

Notes

Summary table

| Theme | Parameter pruning and sharing | Low-rank factorization | Transferred/compact convolutional filters | Knowledge distillation |
| --- | --- | --- | --- | --- |
| Description | Reduce redundant parameters | Use matrix/vector decomposition to estimate the informative parameters | Design convolutional filters with special structure to save parameters | Train a compact NN with knowledge distilled from a larger model |
| Applications | Conv & FC layers | Conv & FC layers | Conv layer only | Conv & FC layers |

II. Parameter pruning and sharing

The paper "Compressing deep convolutional networks using vector quantization" shows that network pruning is very effective at reducing NN complexity and addressing overfitting.

A. Quantization and Binarization
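Quantization reduces storage by letting many weights share a small codebook of values. A minimal sketch of k-means weight quantization, in the spirit of the vector-quantization paper cited above (the function name and this plain scalar k-means are illustrative assumptions, not that paper's exact product-quantization scheme):

```python
import numpy as np

def kmeans_quantize(weights, n_clusters=4, n_iter=20, seed=0):
    """Quantize a weight array to n_clusters shared values.

    Minimal k-means sketch: after quantization only the small codebook
    and per-weight cluster indices need to be stored.
    """
    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    # initialize centroids by sampling distinct weight values
    centroids = rng.choice(flat, size=n_clusters, replace=False)
    for _ in range(n_iter):
        # assign each weight to its nearest centroid
        assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        # move each centroid to the mean of its assigned weights
        for k in range(n_clusters):
            if np.any(assign == k):
                centroids[k] = flat[assign == k].mean()
    quantized = centroids[assign].reshape(weights.shape)
    return quantized, centroids

W = np.array([[0.1, 0.12, -0.5], [-0.52, 0.9, 0.88]])
Wq, codebook = kmeans_quantize(W, n_clusters=3)
```

Binarization is the extreme case: a codebook of only two values (e.g. {-1, +1}).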

B. Pruning and Sharing
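The simplest pruning criterion is weight magnitude: zero out the smallest-magnitude weights so the matrix becomes sparse. A minimal sketch (the fixed global sparsity threshold is an illustrative assumption, not the survey's only criterion):

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.5):
    """Zero out the given fraction of smallest-magnitude weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # threshold = magnitude of the k-th smallest weight
    threshold = np.sort(flat)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

W = np.array([[0.05, -0.8, 0.01], [0.6, -0.02, 0.3]])
Wp = magnitude_prune(W, sparsity=0.5)
```

In practice pruning is followed by fine-tuning to recover accuracy; parameter *sharing* goes further by tying surviving weights to common values.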

C. Designing Structural Matrix
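A structural matrix replaces a dense m×n FC weight matrix with a structured one described by far fewer parameters. A common choice is a circulant matrix: it is fully determined by its first column, and its matrix-vector product can be computed in O(n log n) via the FFT. A sketch (assuming NumPy; the helper name is ours):

```python
import numpy as np

def circulant_matvec(c, x):
    """Multiply the circulant matrix with first column c by vector x.

    Uses the fact that a circulant matvec is a circular convolution,
    so it is diagonalized by the FFT: only c (n values) is stored,
    never the full n x n matrix.
    """
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

c = np.array([1.0, 2.0, 3.0, 4.0])
x = np.array([0.5, -1.0, 2.0, 0.0])
y_fast = circulant_matvec(c, x)

# reference: build the dense circulant matrix explicitly and compare
n = len(c)
C = np.array([[c[(i - j) % n] for j in range(n)] for i in range(n)])
y_dense = C @ x
```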

III. Low-rank factorization
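Low-rank factorization approximates a weight matrix W (m×n) as a product of two thin matrices, cutting parameters from m·n to r·(m+n) for rank r. A minimal truncated-SVD sketch (function name is ours; convolutional variants decompose the 4-D kernel tensor instead):

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate W as U_r @ V_r via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]  # absorb singular values into U
    V_r = Vt[:rank, :]
    return U_r, V_r

rng = np.random.default_rng(0)
# an exactly rank-2 "weight matrix" for illustration
W = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 10))
U_r, V_r = low_rank_factorize(W, rank=2)
W_approx = U_r @ V_r
```

Here the factorized form stores 2·(8+10)=36 values instead of 80, and at inference the single FC layer becomes two smaller ones.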

IV. Transferred/compact convolutional filters
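One well-known compact filter design is the depthwise separable convolution (popularized by MobileNet), used here purely as an illustration of the parameter savings; a quick parameter-count comparison:

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k filters plus a 1x1 pointwise convolution."""
    return c_in * k * k + c_in * c_out

std = conv_params(64, 128, 3)
compact = depthwise_separable_params(64, 128, 3)
```

For a 3×3 layer with 64 input and 128 output channels, the compact design needs 8,768 parameters versus 73,728 for the standard convolution, which is why this family applies to conv layers only.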

V. Knowledge distillation
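In distillation, the compact student is trained to match the larger teacher's temperature-softened output distribution. A sketch of the soft-target term of the distillation loss (the hard-label cross-entropy term and the temperature value are simplifying assumptions here):

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    """Cross-entropy between the teacher's softened distribution
    and the student's (soft-target term only)."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return float(
        -np.sum(p_teacher * np.log(p_student + 1e-12), axis=-1).mean()
    )

teacher = np.array([[5.0, 1.0, -2.0]])
student_good = np.array([[4.0, 0.5, -1.5]])   # agrees with teacher
student_bad = np.array([[-2.0, 1.0, 5.0]])    # reversed ranking
loss_good = distillation_loss(student_good, teacher)
loss_bad = distillation_loss(student_bad, teacher)
```

A student whose logits rank the classes like the teacher's incurs a much smaller loss, which is the signal that transfers the teacher's "dark knowledge" to the compact model.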
