PyTorch Chinese Tutorial: An Introductory PyTorch Tutorial (Very Detailed)

Reposted

mob64ca141139a2 2023-10-16 14:57:39

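Contents

Best
Preface: What is PyTorch?

pytorch
Documentation

1. Building tensors
2. Tensor operations
3. Converting between NumPy arrays and PyTorch tensors
4. Using the GPU

4.1 Two ways to move a tensor to the GPU

5. A simple two-layer neural network exercise

5.1 Computing gradients with PyTorch

5.1.1 Computing gradients by hand
5.1.2 Letting PyTorch compute gradients automatically

5.2 Using PyTorch's nn (neural network) library
5.3 Updating parameters with an optimizer (optim)
5.4 Writing the model as a class (the standard approach)

6. Summary of the steps for writing a neural network

Best

The best approach is to follow the official tutorial: https://pytorch.org/tutorials/beginner/basics/intro.html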

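Preface: What is PyTorch?

pytorch

PyTorch is a Python-based scientific computing library with the following features:

Its basic data type, the tensor, is similar to NumPy's ndarray, but PyTorch can also track the computations performed on it, for example to compute gradients.
It can be used to define deep learning models, and to train and use those models flexibly.

Documentation

Official Chinese documentation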

1. Building tensors

import torch

Build a one-dimensional tensor, using ones as an example:

x=torch.ones(3)
x

tensor([1., 1., 1.])

Use empty to build an uninitialized matrix (its values are whatever happened to be in that memory):

x=torch.empty(5,3)
x

tensor([[9.2755e-39, 1.0561e-38, 9.1837e-39],
        [1.0653e-38, 4.2246e-39, 1.0286e-38],
        [1.0653e-38, 1.0194e-38, 8.4490e-39],
        [1.0469e-38, 9.3674e-39, 9.9184e-39],
        [8.7245e-39, 9.2755e-39, 8.9082e-39]])

Use rand to build a randomly initialized matrix (values drawn uniformly from [0, 1)):

x=torch.rand(5,3)
x

tensor([[0.4299, 0.3554, 0.6924],
        [0.0979, 0.2094, 0.3660],
        [1.0000, 0.4666, 0.8967],
        [0.9141, 0.9506, 0.1320],
        [0.3213, 0.7118, 0.3338]])

Use zeros to build an all-zero matrix. Without a dtype argument the result is float32, not long:

x=torch.zeros(5,3)
print(x.dtype)
x

torch.float32





tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]])

Build a matrix of type long by passing dtype explicitly:

x=torch.zeros(5,3,dtype=torch.long)
x.dtype

torch.int64

Construct a tensor directly from data:

x=torch.tensor([[5.2,3],[2,4],[0,1]])
x

tensor([[5.2000, 3.0000],
        [2.0000, 4.0000],
        [0.0000, 1.0000]])

Build a tensor from an existing tensor. The new tensor reuses the properties of the original, such as its data type, unless new values are provided:

y=x.new_ones(5,3)
print(y.dtype,x.dtype)
y

torch.float32 torch.float32





tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
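The "unless new values are provided" part can be illustrated with a minimal sketch. The variable names `same` and `other` are chosen here for illustration and do not come from the original tutorial:

```python
import torch

x = torch.tensor([[5.2, 3], [2, 4], [0, 1]])   # float32 by default

same = x.new_ones(5, 3)                        # inherits dtype and device from x
other = x.new_ones(5, 3, dtype=torch.double)   # an explicit dtype overrides the inherited one

print(same.dtype, other.dtype)
```

Only the shape is mandatory for `new_ones`; anything not passed explicitly is taken from `x`.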

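Use randn_like to produce a tensor with the same shape as an existing one, filled with random numbers drawn from a standard normal distribution:

x=torch.tensor([[5,3],[2,4],[0,1]])
print(x)
# randn_like needs a floating-point dtype; x is an integer (long) tensor, so dtype=torch.float is passed explicitly
y=torch.randn_like(x,dtype=torch.float)
y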

tensor([[5, 3],
        [2, 4],
        [0, 1]])





tensor([[ 0.0310,  1.4271],
        [-0.9848,  0.4927],
        [ 0.3195,  0.9557]])

Get the shape of a tensor:

# Option 1: the shape attribute
print(x.shape)
# Option 2: the size() method
print(x.size())

torch.Size([3, 2])
torch.Size([3, 2])

2. Tensor operations

Addition

x=torch.rand(3,2)
print(x)
y=torch.rand(3,2)
print(y)

tensor([[0.3739, 0.4920],
        [0.5229, 0.3832],
        [0.3701, 0.8966]])
tensor([[0.0254, 0.2324],
        [0.1292, 0.2256],
        [0.9176, 0.8742]])

# Addition, form 1: the + operator
print("Addition 1:")
x+y

Addition 1:





tensor([[0.3994, 0.7243],
        [0.6522, 0.6088],
        [1.2877, 1.7708]])

# Addition, form 2: torch.add
print("Addition 2:")
torch.add(x,y)

Addition 2:





tensor([[0.3994, 0.7243],
        [0.6522, 0.6088],
        [1.2877, 1.7708]])

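In-place addition (every in-place operation has a name ending in _, and it modifies the tensor it is called on):

y=torch.rand(3,2)
x=torch.rand(3,2)
print(y)
y.add(x)  # no underscore: returns a new tensor and leaves y unchanged (the result is discarded here)
print(y)
y.add_(x)  # with underscore: y itself is modified
print(y)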

tensor([[0.5951, 0.3312],
        [0.9888, 0.9005],
        [0.0621, 0.2813]])
tensor([[0.5951, 0.3312],
        [0.9888, 0.9005],
        [0.0621, 0.2813]])
tensor([[1.3827, 0.8883],
        [1.4221, 1.8106],
        [0.5017, 0.8420]])
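The difference between `add` and `add_` can be checked directly. The following is a small sketch with fixed values instead of random ones, so the result is predictable; the helper names `y_unchanged` and `same_storage` are illustrative only:

```python
import torch

x = torch.ones(3, 2)
y = torch.zeros(3, 2)

z = y.add(x)                         # out-of-place: returns a new tensor
y_unchanged = bool((y == 0).all())   # y still holds zeros

ptr_before = y.data_ptr()
y.add_(x)                            # in-place: writes the result into y's existing storage
same_storage = y.data_ptr() == ptr_before

print(y_unchanged, same_storage)
print(torch.equal(y, z))             # both now hold x + 0
```

To keep the result of the out-of-place form, it has to be assigned to a name, as with `z` here.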

NumPy-style indexing also works on PyTorch tensors:

y[1:,1:]

tensor([[1.8106],
        [0.8420]])

Changing the shape of a tensor

x=torch.randn(4,4)
x

tensor([[-1.1561, -0.6490,  2.2239, -1.1019],
        [-0.8052,  1.4307, -0.0130, -0.1001],
        [-1.4327, -0.6575,  0.8889, -0.9387],
        [ 1.9800,  1.1328,  1.7042,  0.2872]])

# reshape to 1-D
y=x.view(16)
print(y)
# reshape to 1 row and 16 columns; still 2-D
y=x.view(1,16)
print(y)
# use -1 to let PyTorch infer a dimension, e.g. 2 rows and 8 columns
y=x.view(-1,8)
print(y)
y=x.view(2,-1)
print(y)

tensor([-1.1561, -0.6490,  2.2239, -1.1019, -0.8052,  1.4307, -0.0130, -0.1001,
        -1.4327, -0.6575,  0.8889, -0.9387,  1.9800,  1.1328,  1.7042,  0.2872])
tensor([[-1.1561, -0.6490,  2.2239, -1.1019, -0.8052,  1.4307, -0.0130, -0.1001,
         -1.4327, -0.6575,  0.8889, -0.9387,  1.9800,  1.1328,  1.7042,  0.2872]])
tensor([[-1.1561, -0.6490,  2.2239, -1.1019, -0.8052,  1.4307, -0.0130, -0.1001],
        [-1.4327, -0.6575,  0.8889, -0.9387,  1.9800,  1.1328,  1.7042,  0.2872]])
tensor([[-1.1561, -0.6490,  2.2239, -1.1019, -0.8052,  1.4307, -0.0130, -0.1001],
        [-1.4327, -0.6575,  0.8889, -0.9387,  1.9800,  1.1328,  1.7042,  0.2872]])
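One point worth knowing about `view` is that it does not copy data: the result shares storage with the original tensor. A minimal sketch, using zeros instead of random values so the effect is easy to see:

```python
import torch

x = torch.zeros(4, 4)
y = x.view(2, -1)      # -1 is inferred as 8, because 16 / 2 = 8

y[0, 0] = 7.0          # writing through the view ...
print(x[0, 0])         # ... is visible in the original tensor
print(y.shape)
```

If an independent copy is needed, calling `x.clone()` before reshaping is one option.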

When a tensor has exactly one element, .item() returns its value as a Python number:

x=torch.randn(1)
x

tensor([2.0330])

x=x.item()
x

2.032999038696289
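For tensors with more than one element, `.item()` is not available, and `.tolist()` is a common alternative. The sketch below uses fixed values for illustration; the exception type caught is deliberately broad because it has differed between PyTorch versions:

```python
import torch

single = torch.tensor([2.5])
value = single.item()          # a plain Python float

many = torch.tensor([1.0, 2.0, 3.0])
values = many.tolist()         # a list of Python floats

try:
    many.item()                # more than one element: not allowed
    raised = False
except (ValueError, RuntimeError):
    raised = True

print(value, values, raised)
```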

Transpose

x=torch.rand(2,3)
x

tensor([[0.8038, 0.9835, 0.5112],
        [0.6364, 0.4693, 0.9986]])

x=x.transpose(1,0)
x

tensor([[0.8038, 0.6364],
        [0.9835, 0.4693],
        [0.5112, 0.9986]])
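A transposed tensor is a view with swapped strides rather than a copy, which matters when it is combined with `view` from the previous section. The following sketch uses `torch.arange` so the values are predictable; it is an illustration, not part of the original tutorial:

```python
import torch

x = torch.arange(6.0).reshape(2, 3)
t = x.transpose(1, 0)          # for a 2-D tensor this matches x.t()

print(torch.equal(t, x.t()))
print(t.is_contiguous())       # False: the memory layout no longer matches the shape

try:
    t.view(6)                  # view requires a compatible memory layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat = t.reshape(6)            # reshape copies the data when a view is not possible
flat2 = t.contiguous().view(6) # or make the tensor contiguous first
print(flat)
```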

Further reading

Many more tensor operations, including transposing, indexing, slicing,
mathematical operations, linear algebra and random numbers, are described at
<https://pytorch.org/docs/torch>.

3. Converting between NumPy arrays and PyTorch tensors

Converting a PyTorch tensor to a NumPy array

x=torch.ones(3)
x

tensor([1., 1., 1.])

y=x.numpy()
y

array([1., 1., 1.], dtype=float32)

Note that the PyTorch tensor and the NumPy array above share the same memory, so changing one also changes the other:

y[1]=2
y

array([1., 2., 1.], dtype=float32)

x

tensor([1., 2., 1.])
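When the sharing is not wanted, one option is to clone the tensor before converting it. A minimal sketch, where `shared` and `copied` are illustrative names:

```python
import torch

x = torch.ones(3)
shared = x.numpy()            # shares memory with x
copied = x.clone().numpy()    # backed by an independent copy

x[1] = 2                      # modify the tensor

print(shared)                 # reflects the change
print(copied)                 # still all ones
```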

Converting a NumPy array to a PyTorch tensor

import numpy as np
x=np.ones(3)
print(x)
y=torch.from_numpy(x)
print(y)

[1. 1. 1.]
tensor([1., 1., 1.], dtype=torch.float64)

np.add(x,2,out=x)
print(x)
print(y)

[3. 3. 3.]
tensor([3., 3., 3.], dtype=torch.float64)

x=np.add(x,1)
print(x)
print(y)

[4. 4. 4.]
tensor([3., 3., 3.], dtype=torch.float64)

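The two code blocks above show that passing out=x writes the result into the existing array, whose memory the tensor y shares, whereas x=np.add(x,1) creates a new array and rebinds the name x to it, leaving y unchanged.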
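The same behavior can be laid out side by side, together with `torch.tensor`, which copies the data instead of sharing it. This is a sketch with illustrative variable names (`a`, `b`, `shared`, `copied`):

```python
import numpy as np
import torch

a = np.ones(3)
shared = torch.from_numpy(a)   # shares memory with a
copied = torch.tensor(a)       # copies the data

np.add(a, 2, out=a)            # in place: the shared tensor sees the change
print(shared, copied)

b = a                          # keep a second name for the original array
a = a + 1                      # creates a new array and rebinds the name a
print(a, b, shared)            # b and shared still hold the old values
```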

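4. Using the GPU

To accelerate computation on a GPU, every tensor involved must be moved to the GPU: both the model (whose parameters are tensors too) and the model's input data.

Check whether a GPU is available; a result of True means it is: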

torch.cuda.is_available()

True

4.1 Two ways to move a tensor to the GPU

Method 1: use .to() to move tensors to the GPU

device=torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model=model.to(device)
x=x.to(device)
y=y.to(device)

Method 2: use .cuda()

model=model.cuda()
x=x.cuda()
y=y.cuda()
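The snippets above assume that `model`, `x` and `y` already exist. A self-contained sketch of the `.to(device)` pattern, which also runs on machines without a GPU, might look like this; the tiny `nn.Linear` model and the tensor sizes are placeholders chosen for illustration:

```python
import torch
import torch.nn as nn

# Fall back to the CPU when no CUDA device is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)     # placeholder model, for illustration only
x = torch.rand(3, 4).to(device)        # create on the CPU, then move
y = torch.rand(3, 2, device=device)    # or create directly on the target device

out = model(x)                         # model and input are on the same device
print(out.device, y.device)
```

For tensors, `.to()` returns a new tensor, so the result has to be assigned back. The `.to(device)` form is often preferred over `.cuda()` because the same code runs unchanged on CPU-only machines.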

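The following example uses .to() to move tensors to the GPU and run a computation there:

x=torch.rand(3)
y=torch.rand(3)
device=torch.device("cuda")
y=torch.ones_like(y,device=device)  # CUDA is NVIDIA's GPU computing platform; this creates the tensor directly on the GPU (this machine has an NVIDIA GPU)
x=x.to(device)  # move x to the GPU
z=x+y  # both x and y must be on the GPU here
z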

tensor([1.9240, 1.4169, 1.3466], device='cuda:0')

Move z back from the GPU to the CPU; .to() can change the dtype at the same time:

print(z.to("cpu",torch.double))

tensor([1.9240, 1.4169, 1.3466], dtype=torch.float64)

Note: a tensor on the CPU can be converted to NumPy directly. A tensor on the GPU must first be moved to the CPU, because NumPy only operates on CPU memory.

x=torch.rand(3)
print(x.numpy())
if torch.cuda.is_available():
    device=torch.device("cuda")
    x=x.to(device)
    print(x)
    # The next line would raise: TypeError: can't convert CUDA tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
    #print(x.numpy())
    x=x.to("cpu")  # .to() returns a new tensor, so the result must be assigned back
    # back on the CPU, x can be converted to NumPy again
    print(x)

[0.1902067  0.26630002 0.84085023]
tensor([0.1902, 0.2663, 0.8409], device='cuda:0')
tensor([0.1902, 0.2663, 0.8409])
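A common way to make the conversion work regardless of where the tensor lives is a small helper. `to_numpy` below is a hypothetical helper written for this illustration, not a PyTorch function:

```python
import torch

def to_numpy(t):
    """Hypothetical helper: return a NumPy array for a tensor on any device."""
    # detach() drops gradient tracking, cpu() is a no-op for CPU tensors
    return t.detach().cpu().numpy()

x = torch.rand(3, requires_grad=True)
if torch.cuda.is_available():
    x = x.to("cuda")

arr = to_numpy(x)
print(type(arr), arr.shape)
```

The `detach()` call is included because calling `.numpy()` on a tensor that requires gradients raises an error.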

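5. A simple two-layer neural network exercise

5.1 Computing gradients with PyTorch

PyTorch: Tensors and autograd

One of PyTorch's most important features is autograd: once the forward pass is defined and the loss is computed, PyTorch can differentiate automatically and compute the gradient of the loss with respect to every model parameter.

A PyTorch Tensor represents a node in the computation graph. If x is a Tensor with x.requires_grad=True, then after a backward pass x.grad is another Tensor holding the gradient of some scalar (usually the loss) with respect to x.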

5.1.1 Computing gradients by hand

N,D_in,H,D_out=64,1000,100,10
# create some random training data
x=torch.randn(N,D_in)
y=torch.randn(N,D_out)

w1=torch.randn(D_in,H)
w2=torch.randn(H,D_out)

learning_rate=1e-6
for it in range(500):
    #forward pass
    h=x.mm(w1)
    h_relu=h.clamp(min=0)
    y_pred=h_relu.mm(w2)
    
    #compute loss
    loss=(y_pred-y).pow(2).sum().item()
    print(it,loss)
    
    #backward pass
    #compute the gradient
    grad_y_pred=2.0*(y_pred-y)
    grad_w2=h_relu.t().mm(grad_y_pred)
    grad_h_relu=grad_y_pred.mm(w2.t())
    grad_h=grad_h_relu.clone()
    grad_h[h<0]=0
    grad_w1=x.t().mm(grad_h)
    
    #update weights of w1 and w2
    w1-=learning_rate*grad_w1
    w2-=learning_rate*grad_w2

0 42958784.0
1 46169804.0
2 48758020.0
(omitted)
496 0.00026750023243948817
497 0.0002627377980388701
498 0.0002574215177446604
499 0.0002527687174733728
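The hand-written backward pass above can be checked against autograd. The sketch below repeats one forward and backward step with the same formulas and compares the results; the smaller sizes, the fixed seed and the double precision are choices made here to keep the check fast and numerically clean, not part of the original tutorial:

```python
import torch

torch.manual_seed(0)
N, D_in, H, D_out = 8, 20, 10, 5   # smaller than in the tutorial, for a quick check
x = torch.randn(N, D_in, dtype=torch.double)
y = torch.randn(N, D_out, dtype=torch.double)
w1 = torch.randn(D_in, H, dtype=torch.double, requires_grad=True)
w2 = torch.randn(H, D_out, dtype=torch.double, requires_grad=True)

# forward pass and loss, as in the manual version
h = x.mm(w1)
h_relu = h.clamp(min=0)
y_pred = h_relu.mm(w2)
loss = (y_pred - y).pow(2).sum()
loss.backward()                    # autograd fills w1.grad and w2.grad

# the same gradients, computed by hand with the chain rule
with torch.no_grad():
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h = grad_y_pred.mm(w2.t())
    grad_h[h < 0] = 0              # ReLU passes no gradient where its input was negative
    grad_w1 = x.t().mm(grad_h)

print(torch.allclose(grad_w1, w1.grad), torch.allclose(grad_w2, w2.grad))
```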

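5.1.2 Letting PyTorch compute gradients automatically

A gradient (derivative) example

Note: after updating parameters with their gradients, the gradients must be reset to zero, because PyTorch accumulates gradients: without the reset, the next backward pass adds the new gradient to the previous one.

x=torch.tensor(1.0,requires_grad=True)
w=torch.tensor(2.0,requires_grad=True)
b=torch.tensor(3.0,requires_grad=True)
y=w*x+b # y = 2*1 + 3

# compute the gradient of y with respect to every tensor that has requires_grad=True
y.backward()
# derivative of y with respect to x, i.e. dy/dx
print("dy/dx:",x.grad)

# derivative of y with respect to w, i.e. dy/dw
print("dy/dw:",w.grad)

# derivative of y with respect to b, i.e. dy/db
print("dy/db:",b.grad)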

dy/dx: tensor(2.)
dy/dw: tensor(1.)
dy/db: tensor(1.)
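The accumulation mentioned in the note can be seen with a minimal sketch that calls backward twice without zeroing in between. The names `first`, `second` and `third` are illustrative:

```python
import torch

x = torch.tensor(1.0)
w = torch.tensor(2.0, requires_grad=True)

(w * x).backward()
first = w.grad.item()      # d(w*x)/dw = x = 1.0

(w * x).backward()         # no zeroing in between: the new gradient is added
second = w.grad.item()     # 1.0 + 1.0 = 2.0

w.grad.zero_()             # reset, then compute again
(w * x).backward()
third = w.grad.item()      # back to 1.0

print(first, second, third)
```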

N,D_in,H,D_out=64,1000,100,10
# create some random training data
x=torch.randn(N,D_in)
y=torch.randn(N,D_out)

w1=torch.randn(D_in,H,requires_grad=True)
w2=torch.randn(H,D_out,requires_grad=True)

learning_rate=1e-6
for it in range(500):
    #forward pass
    y_pred=x.mm(w1).clamp(min=0).mm(w2)

    #compute loss
    loss=(y_pred-y).pow(2).sum()
    print(it,loss.item())

    #backward pass,compute the gradient
    loss.backward()  # compute the gradients of the loss with respect to w1 and w2

    #update weights of w1 and w2
    with torch.no_grad():  # disable gradient tracking so the updates to w1 and w2 are not recorded in the computation graph
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        # after updating w1 and w2, zero their gradients; otherwise the next backward pass adds to the previous gradients
        w1.grad.zero_()
        w2.grad.zero_()

0 33515288.0
1 28960210.0
2 25787898.0
(omitted)
497 7.83200521254912e-05
498 7.704120071139187e-05
499 7.612317858729511e-05

5.2 Using PyTorch's nn (neural network) library

import torch
import torch.nn as nn

N,D_in,H,D_out=64,1000,100,10

# create some random training data
x=torch.randn(N,D_in)
y=torch.randn(N,D_out)

model=torch.nn.Sequential(
    torch.nn.Linear(D_in,H),# w1*x+b1
    torch.nn.ReLU(),
    torch.nn.Linear(H,D_out),
)
# move to the GPU
#model=model.cuda()

loss_fn=nn.MSELoss(reduction='sum')

learning_rate=1e-3
for it in range(500):
    #forward pass
    y_pred=model(x)

    #compute loss
    loss=loss_fn(y_pred,y)
    print(it,loss.item())

    #backward pass,compute the gradient
    loss.backward()  # compute the gradient of the loss with respect to every model parameter (weights and biases)

    #update the model parameters
    with torch.no_grad():  # disable gradient tracking so the parameter updates are not recorded in the computation graph
        # all model parameters are available through model.parameters()
        for param in model.parameters():
            param -= learning_rate * param.grad

    # zero the gradients of all parameters before the next backward pass
    model.zero_grad()

0 612.6693725585938
1 300.2776794433594
2 176.84945678710938
(omitted)
497 1.6442732592159004e-12
498 1.673319469097656e-12
499 1.648422717770437e-12

model

Sequential(
  (0): Linear(in_features=1000, out_features=100, bias=True)
  (1): ReLU()
  (2): Linear(in_features=100, out_features=10, bias=True)
)
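The printed structure shows the layers; the parameters that `model.parameters()` iterates over can be listed with `named_parameters()`. A short sketch that rebuilds the same architecture so it runs on its own:

```python
import torch
import torch.nn as nn

D_in, H, D_out = 1000, 100, 10
model = nn.Sequential(nn.Linear(D_in, H), nn.ReLU(), nn.Linear(H, D_out))

# Each Linear layer contributes a weight matrix and a bias vector; ReLU has no parameters
names = []
for name, param in model.named_parameters():
    names.append(name)
    print(name, tuple(param.shape))

total = sum(p.numel() for p in model.parameters())
print(total)   # 1000*100 + 100 + 100*10 + 10 = 101110
```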

5.3 Updating parameters with an optimizer (optim)

import torch
import torch.nn as nn

N,D_in,H,D_out=64,1000,100,10

# create some random training data
x=torch.randn(N,D_in)
y=torch.randn(N,D_out)

model=torch.nn.Sequential(
    torch.nn.Linear(D_in,H),# w1*x+b1
    torch.nn.ReLU(),
    torch.nn.Linear(H,D_out),
)
# move to the GPU
#model=model.cuda()

loss_fn=nn.MSELoss(reduction='sum')

learning_rate=1e-4

# define the optimizer
optimizer=torch.optim.Adam(model.parameters(),lr=learning_rate)

for it in range(500):
    #forward pass
    y_pred=model(x)  # model(x) invokes model.forward(x) through nn.Module.__call__

    #compute loss
    loss=loss_fn(y_pred,y)
    print(it,loss.item())

    #backward pass,compute the gradient
    loss.backward()  # compute the gradient of the loss with respect to every model parameter

    # update the parameters
    optimizer.step()

    # zero the gradients
    optimizer.zero_grad()

0 676.9285888671875
1 659.3667602539062
2 642.3128051757812
(omitted)
497 9.99489770947548e-07
498 9.476103173255979e-07
499 8.98813595995307e-07
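To see what `optimizer.step()` replaces, the sketch below compares one manual update from section 5.2 with one step of `torch.optim.SGD`. Note that plain SGD is swapped in here for the Adam optimizer used above, because SGD without momentum applies exactly the rule `param -= lr * grad`; Adam uses a different, adaptive rule. The small model and data sizes are arbitrary choices for illustration:

```python
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(16, 8)
y = torch.randn(16, 2)
lr = 1e-3
loss_fn = nn.MSELoss(reduction='sum')

model_a = nn.Linear(8, 2)
model_b = copy.deepcopy(model_a)       # identical starting weights

# (a) manual update, as in section 5.2
loss_fn(model_a(x), y).backward()
with torch.no_grad():
    for p in model_a.parameters():
        p -= lr * p.grad
model_a.zero_grad()

# (b) the same step through an optimizer
opt = torch.optim.SGD(model_b.parameters(), lr=lr)
loss_fn(model_b(x), y).backward()
opt.step()
opt.zero_grad()

same = all(torch.allclose(pa, pb)
           for pa, pb in zip(model_a.parameters(), model_b.parameters()))
print(same)
```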

5.4 Writing the model as a class (the standard approach)

import torch
import torch.nn as nn

N,D_in,H,D_out=64,1000,100,10

# create some random training data
x=torch.randn(N,D_in)
y=torch.randn(N,D_out)

class TwoLayerNet(torch.nn.Module):  # inherit from nn.Module
    def __init__(self,D_in,H,D_out):  # define the layers with learnable parameters in __init__; this sets up the model structure
        super(TwoLayerNet,self).__init__()  # initialize the parent class
        self.linear1=torch.nn.Linear(D_in,H,bias=False)
        self.linear2=torch.nn.Linear(H,D_out,bias=False)

    def forward(self,x):
        y_pred=self.linear2(self.linear1(x).clamp(min=0))
        return y_pred

model=TwoLayerNet(D_in,H,D_out)

# move to the GPU if one is available, otherwise stay on the CPU
device=torch.device("cuda" if torch.cuda.is_available() else "cpu")
model=model.to(device)
x=x.to(device)
y=y.to(device)

loss_fn=nn.MSELoss(reduction='sum')

learning_rate=1e-4

# define the optimizer
optimizer=torch.optim.Adam(model.parameters(),lr=learning_rate)

for it in range(500):
    #forward pass
    y_pred=model(x)

    #compute loss
    loss=loss_fn(y_pred,y)
    print(it,loss.item())

    #backward pass,compute the gradient
    loss.backward()  # compute the gradient of the loss with respect to the weights of linear1 and linear2

    # update the parameters
    optimizer.step()

    # zero the gradients
    optimizer.zero_grad()

0 666.7147216796875
1 649.9814453125
2 633.6069946289062
(omitted)
497 8.311786103831764e-08
498 7.800667845003773e-08
499 7.324160833377391e-08

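6. Summary of the steps for writing a neural network

The general workflow for writing a neural network:

1. Load the data and settle on its format
2. Create the model
3. Construct the loss function
4. Define the optimizer
5. Train, choosing the batch size and the number of passes over the training data (epochs). Each training step consists of:

forward pass
compute loss
backward pass
update weights
clear gradient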
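The steps above can be sketched as one short script. The examples earlier in this tutorial feed all 64 samples at once; the version below adds mini-batches through `TensorDataset` and `DataLoader` from `torch.utils.data` to illustrate the batch size and epoch choices in step 5. The batch size of 16, the 5 epochs and the random data are arbitrary assumptions made for illustration:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
N, D_in, H, D_out = 64, 1000, 100, 10

# 1. load the data (random placeholder data, as in the tutorial)
dataset = TensorDataset(torch.randn(N, D_in), torch.randn(N, D_out))
loader = DataLoader(dataset, batch_size=16, shuffle=True)

# 2. create the model, 3. the loss function, 4. the optimizer
model = nn.Sequential(nn.Linear(D_in, H), nn.ReLU(), nn.Linear(H, D_out))
loss_fn = nn.MSELoss(reduction='sum')
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# 5. train for a few epochs, one mini-batch at a time
epoch_losses = []
for epoch in range(5):
    total = 0.0
    for xb, yb in loader:
        y_pred = model(xb)            # forward pass
        loss = loss_fn(y_pred, yb)    # compute loss
        loss.backward()               # backward pass
        optimizer.step()              # update weights
        optimizer.zero_grad()         # clear gradient
        total += loss.item()
    epoch_losses.append(total)
    print(epoch, total)
```

With real data, the `TensorDataset` would be replaced by a dataset that reads the actual samples; the loop body stays the same.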
