The Difference Between PyTorch's Linear and Conv2d: PyTorch Conv2d Parameters

Reposted

mob6454cc634aa4 2024-06-25 04:16:33

PyTorch Conv2d parameters explained

"""
Args:
    in_channels (int): Number of channels in the input image
    out_channels (int): Number of channels produced by the convolution
    kernel_size (int or tuple): Size of the convolving kernel
    stride (int or tuple, optional): Stride of the convolution. Default: 1
    padding (int or tuple, optional): Zero-padding added to both sides of the input. Default: 0
    padding_mode (string, optional): Accepted values `zeros` and `circular`. Default: `zeros`
    dilation (int or tuple, optional): Spacing between kernel elements. Default: 1
    groups (int, optional): Number of blocked connections from input channels to output channels. Default: 1
    bias (bool, optional): If ``True``, adds a learnable bias to the output. Default: ``True``
"""
def __init__(self, in_channels, out_channels, kernel_size, stride=1,
             padding=0, dilation=1, groups=1,
             bias=True, padding_mode='zeros'):

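1. in_channels
Number of channels in the input (for example, 3 for an RGB image).

2. out_channels
Number of channels in the output, which equals the number of convolution filters.

3. kernel_size
Size of the convolution kernel.

4. stride
Step size with which the kernel slides over the input.

5. padding
Amount of padding (zeros by default) added to each side of the input.

6. dilation
Spacing between kernel elements.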

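With dilation=1 (PyTorch's default), the kernel elements are adjacent and the result is an ordinary convolution, as in the figure:

Blue is the input and green is the output; the kernel is a 3*3 kernel.

[Figure: 3*3 kernel applied to the input with no gaps between kernel elements]

With dilation=2, the result is as in the figure:

Blue is the input and green is the output. The kernel is still 3*3, but neighboring kernel elements are now multiplied with input values that lie one skipped element apart.

[Figure: 3*3 kernel applied to the input with a gap of one element between kernel elements]

Benefits:

The area covered by a single kernel application (the receptive field) grows from 3*3=9 at dilation=1 to 5*5=25 at dilation=2.

The receptive field grows while the number of weights and multiply-adds per output element stays the same, so more context is captured without extra downsampling. This can help tasks that depend on fine detail, such as image restoration.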
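The receptive-field arithmetic above can be sketched in a few lines. This is only an illustration in plain Python, assuming PyTorch's convention that dilation=1 means adjacent kernel elements; the helper names `effective_kernel_size` and `covered_area` are made up for this example and are not part of PyTorch.

```python
def effective_kernel_size(kernel_size: int, dilation: int = 1) -> int:
    """Span of a dilated kernel along one axis, measured in input elements.

    Assumes PyTorch's convention: dilation=1 means the kernel elements are adjacent.
    """
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be >= 1")
    return dilation * (kernel_size - 1) + 1


def covered_area(kernel_size: int, dilation: int = 1) -> int:
    """Number of input positions inside the square spanned by the dilated kernel."""
    side = effective_kernel_size(kernel_size, dilation)
    return side * side


# The number of weights per channel stays 3*3 regardless of the dilation.
for d in (1, 2, 3):
    side = effective_kernel_size(3, d)
    print(f"dilation={d}: spans {side}x{side} = {covered_area(3, d)} positions, "
          f"weights per channel = {3 * 3}")
```

Only the span changes with the dilation; the kernel still reads 9 input values per channel for each output element.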

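7. groups
①

The convolution layer has a groups parameter that splits the input channels and the output channels into groups. The default is 1, meaning all input channels and all output channels form a single group. For example, suppose the input has size 90x100x100x32 (batch 90, 100x100 spatial, 32 channels; PyTorch itself would lay this out as 90x32x100x100) and passes through a 3x3 convolution with 48 output channels. With the default groups=1, every output channel is computed from all 32 input channels, which is an ordinary convolution that is fully connected across channels.

With groups=2, the 32 input channels are split into two groups of 16 and the 48 output channels into two groups of 24. The first 24 output channels are computed by an ordinary convolution over the first 16 input channels, and the second 24 output channels over the second 16 input channels.

In the extreme case where the input and output channel counts are equal, say 24, and groups is also 24, each output channel is computed from only its corresponding input channel (a depthwise convolution).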

②

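For example, take an input of size [1,6,1,1] and conv = nn.Conv2d(in_channels=6, out_channels=6, kernel_size=1, stride=1, padding=0, groups=?, bias=False). With groups=1, the default convolution, conv.weight.data.size() is [6,6,1,1], for a total of 6 * 6 = 36 parameters. With groups=3, each group uses only in_channels/groups = 2 input channels and produces out_channels/groups = 2 output channels, so conv.weight.data.size() is [6,2,1,1], for a total of 6 * 2 = 12 parameters: three independent groups of 2 * 2 = 4 weights each, whose outputs are concatenated along the channel dimension.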

③

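groups determines how many groups the input channels are split into. Each group holds in_channels/groups input channels, which are reused by out_channels/groups filters to produce that group's output channels. This is why both in_channels and out_channels must be divisible by groups.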
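The parameter counts in the groups discussion can be reproduced with a small sketch. It relies on the fact that a Conv2d weight tensor has shape (out_channels, in_channels / groups, kH, kW); the function names `grouped_conv_weight_shape` and `num_weights` are hypothetical helpers written for this example, not PyTorch APIs.

```python
def grouped_conv_weight_shape(in_channels, out_channels, kernel_size, groups=1):
    """Shape of the weight tensor: (out_channels, in_channels // groups, kH, kW)."""
    if in_channels % groups != 0 or out_channels % groups != 0:
        raise ValueError("in_channels and out_channels must both be divisible by groups")
    if isinstance(kernel_size, int):
        kh, kw = kernel_size, kernel_size
    else:
        kh, kw = kernel_size
    return (out_channels, in_channels // groups, kh, kw)


def num_weights(shape):
    """Total number of elements in a tensor of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


# The 6 -> 6 channel, 1x1 kernel example.
for g in (1, 3, 6):
    shape = grouped_conv_weight_shape(6, 6, 1, groups=g)
    print(f"groups={g}: weight shape {shape}, {num_weights(shape)} parameters")

# The 32 -> 48 channel, 3x3 kernel example.
for g in (1, 2):
    shape = grouped_conv_weight_shape(32, 48, 3, groups=g)
    print(f"groups={g}: weight shape {shape}, {num_weights(shape)} parameters")
```

If PyTorch is available, the shapes can be compared against `nn.Conv2d(...).weight.shape` for the same arguments.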

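8. bias
If the convolution is immediately followed by a BN (batch normalization) layer, it is usually best to disable the bias (bias=False): BN subtracts the per-channel mean, which cancels any constant bias, so the bias has no effect and only takes up GPU memory.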
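A minimal numeric sketch of why the bias has no effect in front of BN: normalizing a channel with and without a constant offset gives the same result. The `normalize` helper, the activation values, and the bias value below are all made up for illustration, and the learnable scale and shift of a real BN layer are left out.

```python
from statistics import fmean, pvariance


def normalize(values, eps=1e-5):
    """Batch-norm style normalization of one channel, without learnable scale/shift."""
    mean = fmean(values)
    var = pvariance(values, mu=mean)  # biased (population) variance
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


conv_out = [0.5, -1.2, 3.0, 0.7]  # made-up activations of one output channel
bias = 2.5                        # made-up convolution bias for that channel

without_bias = normalize(conv_out)
with_bias = normalize([v + bias for v in conv_out])

# Largest difference between the two normalized results; any nonzero value
# comes only from floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(without_bias, with_bias))
print(max_diff)
```

Adding the bias shifts the mean by the same amount and leaves the variance unchanged, so the subtraction inside the normalization removes it again.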

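Output shape:
N = floor((W - F + 2P) / S) + 1
N: the output shape is N x N
W: the input shape is W x W
F: the filter size is F x F
P: padding size
S: stride
This form assumes dilation=1; the division is rounded down when it is not exact.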
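As a quick check of the output-size formula, here is a small sketch in plain Python. The function name `conv_output_size` is hypothetical; the optional dilation argument follows the general form given in the PyTorch Conv2d documentation, which reduces to (W - F + 2P)/S + 1 when dilation is 1.

```python
def conv_output_size(w, f, p=0, s=1, d=1):
    """Output length along one axis.

    w: input size, f: filter size, p: padding, s: stride, d: dilation.
    Computes floor((w + 2p - d*(f - 1) - 1) / s) + 1.
    """
    if s < 1 or d < 1 or f < 1:
        raise ValueError("stride, dilation and filter size must be >= 1")
    return (w + 2 * p - d * (f - 1) - 1) // s + 1


print(conv_output_size(32, 3, p=1, s=1))  # 3x3 kernel with padding 1 keeps the size
print(conv_output_size(32, 3, p=1, s=2))  # stride 2 roughly halves it
print(conv_output_size(7, 3, p=0, s=2))
print(conv_output_size(32, 3, d=2))       # dilation 2 makes the kernel span 5 elements
```

Integer floor division is used so that inputs that do not divide evenly by the stride are rounded down rather than producing a fractional size.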
