ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression

1. Architecture

Given a pre-trained model, ThiNet prunes it layer by layer at a predefined compression rate. The overall framework is as follows:

(Figure: the overall iterative pruning framework of ThiNet.)

  1. Filter selection. If a subset of the channels in layer (i+1)'s input can approximate layer (i+1)'s output, the remaining channels (the complement of the subset) can be discarded. Since each channel of layer (i+1)'s input feature map is produced by one filter of layer i, the filters of layer i that produce the discarded channels can be pruned as well.
  2. Pruning. The weak channels of layer (i+1)'s input (those that contribute little) and their corresponding filters in layer i are pruned away, yielding a much smaller model (a minimal code sketch follows this list). The pruned network has exactly the same structure as before, only with fewer filters and channels.
  3. Fine-tuning. Fine-tuning is necessary to recover the generalization ability damaged by filter pruning, but it takes a long time on large datasets and complex models. To save time, the model is fine-tuned for only one or two epochs after each layer is pruned; once all layers have been pruned, more epochs are run to obtain an accurate model.
  4. Iterate to step 1 to prune the next layer.
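Step 2 amounts to slicing weight tensors along the channel dimension. Below is a minimal PyTorch sketch (not the authors' code; the function name and arguments are illustrative) of how the selected filters of layer i and the matching input channels of layer i+1 can be kept while the rest are dropped:

```python
import torch.nn as nn

def prune_conv_pair(conv_i: nn.Conv2d, conv_next: nn.Conv2d, keep):
    """Return smaller copies of two consecutive conv layers, keeping only the
    filters of `conv_i` (and the matching input channels of `conv_next`)
    whose indices are listed in `keep`."""
    keep = sorted(keep)

    # Layer i: keep only the selected filters (output channels).
    new_i = nn.Conv2d(conv_i.in_channels, len(keep), conv_i.kernel_size,
                      stride=conv_i.stride, padding=conv_i.padding,
                      bias=conv_i.bias is not None)
    new_i.weight.data = conv_i.weight.data[keep].clone()
    if conv_i.bias is not None:
        new_i.bias.data = conv_i.bias.data[keep].clone()

    # Layer i+1: keep only the matching input channels; its own filters stay.
    new_next = nn.Conv2d(len(keep), conv_next.out_channels, conv_next.kernel_size,
                         stride=conv_next.stride, padding=conv_next.padding,
                         bias=conv_next.bias is not None)
    new_next.weight.data = conv_next.weight.data[:, keep].clone()
    if conv_next.bias is not None:
        new_next.bias.data = conv_next.bias.data.clone()
    return new_i, new_next
```

Output channel c of layer i feeds input channel c of layer i+1, which is why the same index set is applied to dimension 0 of layer i's weights and dimension 1 of layer (i+1)'s weights.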

Analysis

Denote the convolution operation of layer $i$ by the triplet $\langle \mathcal{I}_i, \mathcal{W}_i, * \rangle$, where $\mathcal{I}_i \in \mathbb{R}^{C \times H \times W}$ is the input feature map of layer $i$ with $C$ channels, and $\mathcal{W}_i \in \mathbb{R}^{D \times C \times K \times K}$ denotes its $D$ filters with $K \times K$ kernels.

(Figure 2: the variables used in the following derivation.)


As shown in Figure 2, pick an arbitrary position in the output feature map of layer $i+1$; its value $\hat{y}$ is produced by one filter $\mathcal{W} \in \mathbb{R}^{C \times K \times K}$ applied to the corresponding sliding window $x \in \mathbb{R}^{C \times K \times K}$ of the input, so that

$$\hat{y} = \sum_{c=1}^{C} \sum_{k_1=1}^{K} \sum_{k_2=1}^{K} \mathcal{W}_{c,k_1,k_2} \, x_{c,k_1,k_2}$$

Define the contribution of channel $c$ as

$$\hat{x}_c = \sum_{k_1=1}^{K} \sum_{k_2=1}^{K} \mathcal{W}_{c,k_1,k_2} \, x_{c,k_1,k_2}$$

so we obtain

$$\hat{y} = \sum_{c=1}^{C} \hat{x}_c$$
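This per-channel decomposition is easy to verify numerically. The small PyTorch check below (illustrative; it assumes a single bias-free filter and one sliding window) confirms that summing the channel-wise contributions reproduces the full convolution response at one location:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
C, K = 8, 3                      # input channels and kernel size
x = torch.randn(1, C, K, K)      # one K*K sliding window of the input
w = torch.randn(1, C, K, K)      # one filter of layer i+1 (no bias)

# Full response at this output location: sum over c, k1, k2 of W * x.
y_hat = F.conv2d(x, w).item()

# Per-channel contributions: x_hat_c = sum over k1, k2 of W_c * x_c.
x_hat = (w * x).sum(dim=(2, 3)).squeeze(0)   # shape (C,)

# The channel contributions sum back to the full response (up to float error).
assert torch.isclose(torch.tensor(y_hat), x_hat.sum(), atol=1e-5)
print(y_hat, x_hat.sum().item())
```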

In the decomposition above, $\hat{x} = (\hat{x}_1, \dots, \hat{x}_C)$ collects the channel contributions and $\hat{y}$ is the response at the chosen location. If we can find a subset of channels $S \subset \{1, 2, \dots, C\}$ with $|S| < C$ such that

$$\hat{y} = \sum_{c \in S} \hat{x}_c$$

still holds, then every channel outside $S$ (and the corresponding filter of layer $i$) can be removed without changing this output, and the model is compressed. In practice the equality only holds approximately once $|S| < C$, so a greedy algorithm is used to find the best subset $S$: given $m$ training samples $\{(\hat{\mathbf{x}}_i, \hat{y}_i)\}$, channel selection is posed as the optimization problem

$$\min_{S} \; \sum_{i=1}^{m} \Big( \hat{y}_i - \sum_{j \in S} \hat{\mathbf{x}}_{i,j} \Big)^2 \quad \text{s.t.} \quad |S| = C \times r, \;\; S \subset \{1, 2, \dots, C\}$$

where $|S|$ is the number of selected channels and $r$ denotes the compression rate. Since $\hat{y}_i = \sum_{j=1}^{C} \hat{\mathbf{x}}_{i,j}$, the problem can be rewritten in terms of the removed channels:

$$\min_{T} \; \sum_{i=1}^{m} \Big( \sum_{j \in T} \hat{\mathbf{x}}_{i,j} \Big)^2 \quad \text{s.t.} \quad |T| = C \times (1 - r), \;\; T \subset \{1, 2, \dots, C\}$$

where $T$ is the set of removed (unselected) channels. The authors also tried other solvers for this problem, but the simple greedy method turns out to be both more accurate and faster.
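A minimal sketch of this greedy procedure is given below. It assumes the channel contributions have already been collected into an m×C matrix `x_hat` (one row per sampled location); the function name and arguments are illustrative, not the authors' implementation.

```python
import numpy as np

def greedy_channel_selection(x_hat: np.ndarray, compression_rate: float):
    """Greedily build the set T of channels to remove, each time adding the
    channel that keeps the squared sum of removed contributions smallest.

    x_hat: (m, C) matrix, x_hat[i, c] is channel c's contribution for sample i.
    Returns the sorted list of channel indices to KEEP (the complement of T).
    """
    m, C = x_hat.shape
    num_to_remove = C - int(round(C * compression_rate))

    removed = []                       # T: channels selected for removal
    removed_sum = np.zeros(m)          # running sum of removed contributions
    for _ in range(num_to_remove):
        best_c, best_err = None, None
        for c in range(C):
            if c in removed:
                continue
            err = np.sum((removed_sum + x_hat[:, c]) ** 2)
            if best_err is None or err < best_err:
                best_c, best_err = c, err
        removed.append(best_c)
        removed_sum += x_hat[:, best_c]

    return sorted(set(range(C)) - set(removed))
```

For example, `keep = greedy_channel_selection(x_hat, compression_rate=0.5)` keeps half of the channels, and the resulting index list can be fed to a pruning routine such as the `prune_conv_pair` sketch above.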

The discussion so far covers the algorithm itself; how is it applied to existing network architectures? The authors mainly consider VGG-16 and ResNet-50. For VGG-16, the first 10 convolutional layers account for about 90% of the computation while the fully-connected layers hold about 86% of the parameters, so the authors prune the first 10 convolutional layers to obtain the speedup and replace all fully-connected layers with a single global average pooling layer. For ResNet-50, because of the special structure of residual blocks, only the first two layers of each residual block are pruned.
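As a concrete illustration of the VGG-16 modification, the sketch below (using torchvision's `vgg16`; the 512-channel figure refers to the unpruned backbone and the 1000-way classifier is an assumption for ImageNet) swaps the fully-connected head for global average pooling followed by one linear classifier:

```python
import torch.nn as nn
from torchvision.models import vgg16

model = vgg16(weights=None)

# Drop the three fully-connected layers (~86% of VGG-16's parameters) and
# classify from a global average pool over the last conv feature map.
model.avgpool = nn.AdaptiveAvgPool2d(1)   # global average pooling -> 1x1 per channel
model.classifier = nn.Linear(512, 1000)   # 512 conv channels -> 1000 classes
```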


Experimental Results

(Figure: experimental results.)