Can ConvLSTM2d be used in PyTorch? PyTorch Conv2d parameters

Reposted

智能创新梦想家 2023-11-09 07:14:23


The PyTorch Chinese documentation describes the parameters of nn.Conv2d as follows:

class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)

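Parameters:

in_channels(int) - number of channels in the input signal
out_channels(int) - number of channels produced by the convolution
kernel_size(int or tuple) - size of the convolution kernel
stride(int or tuple, optional) - stride of the convolution
padding(int or tuple, optional) - number of layers of zeros added to each side of the input
dilation(int or tuple, optional) - spacing between kernel elements
groups(int, optional) - number of blocked connections from input channels to output channels
bias(bool, optional) - if bias=True, a bias is added to the output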
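
The groups and dilation parameters are the least intuitive ones in the list above, so here is a minimal sketch of their effect on shapes. The layer names (dense, grouped, dilated) and the sizes are chosen only for illustration.

```python
import torch
import torch.nn as nn

# groups=1 (default): every output channel sees all 4 input channels.
dense = nn.Conv2d(in_channels=4, out_channels=4, kernel_size=3)
# groups=2: the channels are split into 2 groups, so each output channel
# only sees in_channels / groups = 2 input channels.
grouped = nn.Conv2d(in_channels=4, out_channels=4, kernel_size=3, groups=2)

print(dense.weight.shape)    # torch.Size([4, 4, 3, 3])
print(grouped.weight.shape)  # torch.Size([4, 2, 3, 3])

# dilation=2 spreads a 3x3 kernel over a 5x5 window, so a 7x7 input
# yields a 3x3 output instead of 5x5.
dilated = nn.Conv2d(in_channels=4, out_channels=4, kernel_size=3, dilation=2)
x = torch.rand(1, 4, 7, 7)
print(dense(x).shape)    # torch.Size([1, 4, 5, 5])
print(dilated(x).shape)  # torch.Size([1, 4, 3, 3])
```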

import torch
import torch.nn as nn
import numpy

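On padding and stride: padding extends the input matrix by adding zeros around it, and stride controls how far the kernel moves at each step. Both of them affect the shape of the output.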
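
To make the effect of padding and stride concrete, the output size along one spatial dimension is floor((n + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1. The sketch below compares that formula with what nn.Conv2d actually returns; conv_out_size is a hypothetical helper written for this example, not part of PyTorch.

```python
import torch
import torch.nn as nn

def conv_out_size(n, kernel_size, stride=1, padding=0, dilation=1):
    # Hypothetical helper: output length along one spatial dimension.
    return (n + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

x = torch.rand(1, 1, 8, 8)
results = []
for k, s, p in [(3, 1, 0), (3, 1, 1), (3, 2, 1), (5, 2, 2)]:
    conv = nn.Conv2d(1, 1, kernel_size=k, stride=s, padding=p)
    actual = tuple(conv(x).shape[-2:])
    expected = conv_out_size(8, k, stride=s, padding=p)
    results.append((actual, expected))
    print(f"kernel={k} stride={s} padding={p}: actual={actual}, formula={expected}")
```

With kernel_size=3 and padding=1 the 8x8 input keeps its size, while stride=2 roughly halves it.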

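When using an nn module, the input must be given the extra leading dimensions, including the channel dimension.
There is no need to define the kernel, because its values are randomly initialized here. They can be inspected through parameters(); since that returns an object rather than a list, convert it to a list to display it. To set specific values, you can write your own convolution function.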

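X = torch.rand(4, 4)   # assumed: X was not defined in the original; a 4x4 input matches the printed shape below
corr = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
X = X.view((1, 1) + X.shape)   # nn's built-in convolution expects (batch, channel, H, W), so add the two leading dimensions
print(list(corr.parameters()))   # randomly initialized kernel and bias
print(X.shape)
print(corr)
print(corr(X))
print(list(corr.parameters()))   # unchanged: a forward pass does not modify the parameters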

[Parameter containing:
tensor([[[[-0.0165, -0.2995, -0.0094],
          [-0.1117,  0.3317,  0.0723],
          [ 0.1026, -0.2529, -0.3191]]]], requires_grad=True), Parameter containing:
tensor([-0.0899], requires_grad=True)]
torch.Size([1, 1, 4, 4])
Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
tensor([[[[-0.2450, -0.3480, -0.3027, -0.1314],
          [-0.3507, -0.3348, -0.1942, -0.1996],
          [-0.3338, -0.2402, -0.5688, -0.2324],
          [-0.0597, -0.1412, -0.0888, -0.1264]]]],
       grad_fn=<MkldnnConvolutionBackward>)
[Parameter containing:
tensor([[[[-0.0165, -0.2995, -0.0094],
          [-0.1117,  0.3317,  0.0723],
          [ 0.1026, -0.2529, -0.3191]]]], requires_grad=True), Parameter containing:
tensor([-0.0899], requires_grad=True)]
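
The kernel values printed above are random. Besides writing a custom convolution function, one option is to overwrite the layer's weight tensor in place. The following is a minimal sketch under that assumption; the averaging kernel and the all-ones input are chosen only so that the result is easy to check by hand.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1, bias=False)

# Overwrite the randomly initialized kernel with a fixed 3x3 averaging kernel.
# no_grad() keeps this assignment out of the autograd graph.
with torch.no_grad():
    conv.weight.copy_(torch.full((1, 1, 3, 3), 1.0 / 9.0))

x = torch.ones(1, 1, 4, 4)
y = conv(x)
# Interior positions cover 9 ones (value 1), edges cover 6 (value 6/9),
# and corners cover 4 (value 4/9) because of the zero padding.
print(y[0, 0])
```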

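Multiple channels

Multiple channels are controlled through in_channels and out_channels. These two parameters, and what they require of the input, are fairly hard to understand, so here is an analysis: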

corr2 = nn.Conv2d(in_channels=2, out_channels=4, kernel_size=2, padding=0)
print(list(corr2.parameters())[0].shape)

torch.Size([4, 2, 2, 2])

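The shape above is the shape of the kernel:

The last two 2s mean that each kernel is 2x2.
The first value, 4, corresponds to the 4 output channels. Thinking in terms of dimensions: to produce 4 output channels, each holding its own set of values, the layer needs 4 sets of kernels, which is what the 4 means.
The second value, 2, corresponds to the 2 input channels. Each input channel needs its own 2x2 kernel, so two input channels give the 2.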
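
To check this reading of the weight shape, the sketch below rebuilds one output channel by hand: each input channel is convolved with its own 2x2 kernel, the results are summed, and that output channel's bias is added. The variable names (conv, o, manual) are illustrative only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
conv = nn.Conv2d(in_channels=2, out_channels=4, kernel_size=2, padding=0)
x = torch.rand(1, 2, 3, 3)
y = conv(x)  # shape (1, 4, 2, 2)

o = 0  # output channel to rebuild by hand
# conv.weight[o, c] is the 2x2 kernel applied to input channel c for output channel o.
manual = sum(
    F.conv2d(x[:, c:c + 1], conv.weight[o:o + 1, c:c + 1])
    for c in range(conv.in_channels)
) + conv.bias[o]

matches = torch.allclose(manual[0, 0], y[0, o], atol=1e-5)
print(matches)
```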

X = torch.rand((2,1,3,3))
print(corr2(X))

---------------------------------------------------------------------------

RuntimeError                              Traceback (most recent call last)

<ipython-input-99-f698e441c87d> in <module>()
      1 X = torch.rand((2,1,3,3))
----> 2 print(corr2(X))


E:\soft2\annaconda\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)


E:\soft2\annaconda\lib\site-packages\torch\nn\modules\conv.py in forward(self, input)
    351 
    352     def forward(self, input):
--> 353         return self._conv_forward(input, self.weight)
    354 
    355 class Conv3d(_ConvNd):


E:\soft2\annaconda\lib\site-packages\torch\nn\modules\conv.py in _conv_forward(self, input, weight)
    348                             _pair(0), self.dilation, self.groups)
    349         return F.conv2d(input, weight, self.bias, self.stride,
--> 350                         self.padding, self.dilation, self.groups)
    351 
    352     def forward(self, input):


RuntimeError: Given groups=1, weight of size [4, 2, 2, 2], expected input[2, 1, 3, 3] to have 2 channels, but got 1 channels instead

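The cause of this error is clear: each input map is 3x3, but the input has only 1 channel, i.e. each sample contains a single feature map, whereas corr2 was configured with in_channels=2 and therefore needs two feature maps per sample. Change the code as follows.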

X = torch.rand((2,2,3,3))
print(corr2(X).shape)
print(corr2(X))

torch.Size([2, 4, 2, 2])
tensor([[[[-0.2524, -0.3684],
          [-0.3618,  0.0558]],

         [[ 0.4600,  0.6168],
          [ 0.4752,  0.6174]],

         [[ 0.0682,  0.3968],
          [ 0.1432,  0.3818]],

         [[ 0.0576,  0.4250],
          [ 0.2757,  0.6462]]],


        [[[-0.3360, -0.5825],
          [-0.4622, -0.0016]],

         [[ 0.3572,  0.4425],
          [ 0.4759,  0.4528]],

         [[ 0.2767,  0.4641],
          [ 0.2997,  0.4814]],

         [[ 0.1487,  0.1466],
          [ 0.2228,  0.3224]]]], grad_fn=<MkldnnConvolutionBackward>)
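
A channel mismatch like the one in the traceback earlier can be caught before the forward pass with a small shape check. The sketch below is one way to do it; check_channels is a hypothetical helper, not a PyTorch function. It also confirms how to read the output shape: (batch, out_channels, H_out, W_out), with H_out = 3 - 2 + 1 = 2 here.

```python
import torch
import torch.nn as nn

def check_channels(conv, x):
    # Hypothetical helper: verify that x is (N, C, H, W) with C == conv.in_channels.
    if x.dim() != 4:
        raise ValueError(f"expected a 4D input (N, C, H, W), got {x.dim()}D")
    if x.shape[1] != conv.in_channels:
        raise ValueError(
            f"layer expects {conv.in_channels} channels, input has {x.shape[1]}"
        )

corr2 = nn.Conv2d(in_channels=2, out_channels=4, kernel_size=2, padding=0)
bad = torch.rand(2, 1, 3, 3)   # 1 channel: rejected
good = torch.rand(2, 2, 3, 3)  # 2 channels: accepted

try:
    check_channels(corr2, bad)
except ValueError as err:
    print("rejected:", err)

check_channels(corr2, good)
out = corr2(good)
print(out.shape)  # torch.Size([2, 4, 2, 2])
```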