Getting Started with PyTorch (5): Computing on the GPU


云端FFF · 2022-11-22 10:44:32 · Category: PyTorch


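© Copyright belongs to the author. This is an original work by 云端FFF on the 51CTO blog; contact the author for permission before reposting, otherwise legal action may be taken.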

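Reference: Dive into Deep Learning (动手学深度学习)
Note: this post was converted from a Jupyter notebook, so the code may not run as-is. Some comments show the interactive result Jupyter displayed rather than printed output.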

Contents

1. Computing devices
2. GPU computation with Tensors
3. GPU computation with Modules

1. Computing devices

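Open a CMD window and run the nvidia-smi command to view information about the local GPU.
For complex neural networks and large datasets, computing on the CPU may not be efficient enough. PyTorch lets you specify the device used for storage and computation, such as main memory + CPU or video memory + GPU. By default, PyTorch creates data in main memory and computes on the CPU.
Before using a GPU you need to install the required drivers and libraries. With an NVIDIA graphics card, install NVIDIA's CUDA and cuDNN components (their versions must match your PyTorch version and your GPU model); see "CUDA11.4、CUDNN、Pytorch安装" (installing CUDA 11.4, cuDNN, and PyTorch). Once everything is installed, run the following code to check: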

import torch
# PyTorch version
print(torch.__version__)             # 1.10.2
# Whether a GPU is available
print(torch.cuda.is_available())     # True
# Index of the currently selected GPU (indices start at 0)
print(torch.cuda.current_device())   # 0
# Name of the GPU with the given index
print(torch.cuda.get_device_name(0)) # NVIDIA GeForce GTX 1070
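
The checks above assume a CUDA GPU is present. As a minimal sketch that also runs on a CPU-only machine, the snippet below lists every usable device. The helper name describe_devices is hypothetical (it is not part of PyTorch); torch.cuda.device_count() is the call that returns the number of visible GPUs.

```python
import torch

def describe_devices():
    """Hypothetical helper: return readable descriptions of the usable devices."""
    if not torch.cuda.is_available():
        return ["cpu"]
    # device_count() is the number of visible GPUs; their indices run from 0
    return [
        f"cuda:{i} ({torch.cuda.get_device_name(i)})"
        for i in range(torch.cuda.device_count())
    ]

devices = describe_devices()
print(devices)
```

On a machine without a usable GPU this prints ['cpu']; otherwise it prints one entry per GPU.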

2. GPU computation with Tensors

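A tensor is stored in main memory for use by the CPU by default; its .device attribute shows which device it lives on. To copy a CPU tensor to the GPU, you can:

Call the tensor's .cuda() method
With multiple GPUs, use .cuda(i) to copy to the i-th GPU and its video memory (i starts at 0). Note that .cuda() targets the current CUDA device, which is GPU 0 by default, so .cuda() and .cuda(0) are equivalent unless the current device has been changed
Pass the device argument when creating the tensor, or chain a call to .to() to specify the device

For a tensor that is already on a GPU and in its video memory, call its .cpu() method to move it back to the CPU and main memory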

x = torch.tensor([1, 2, 3])
print(x.device) # cpu

x = x.cuda(0)
print(x.device) # cuda:0

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
y = torch.tensor([1, 2, 3], device=device)
print(y.device) # cuda:0

z = torch.tensor([1, 2, 3]).to(device)
print(z.device) # cuda:0

z = z.cpu()
print(z.device) # cpu
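
One detail worth illustrating: .cuda(), .cpu() and .to() return a tensor on the target device and leave the original where it was, which is why the code above reassigns the result. The following is a small sketch, written to fall back to the CPU when no GPU is available; the variable names are illustrative only.

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

a = torch.tensor([1, 2, 3])   # created on the CPU
b = a.to(device)              # a copy on `device` if it differs from a.device

print(a.device)               # the original tensor is never moved
print(b.device)

# When the tensor already has the requested device (and dtype), .to()
# returns the same object instead of making a copy.
c = b.to(b.device)
print(c is b)                 # True
```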

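Only tensors on the same device can be combined in an operation, and the result stays on that device. If two tensors are stored in different places (CPU vs. GPU, or two different GPUs), computing with them directly raises a RuntimeError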

x = torch.tensor([1, 2, 3]).cuda(0)
y = torch.tensor([1, 2, 3])
print('x:',x.device,'\ny:', y.device)

# x + y   RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
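
One common way to avoid this error is to move one operand to the other's device before the operation. A minimal sketch, assuming it does not matter which device holds the result (it falls back to the CPU when no GPU is available):

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.tensor([1, 2, 3]).to(device)
y = torch.tensor([1, 2, 3])          # always created on the CPU

# Move y to wherever x lives before combining the two tensors
s = x + y.to(x.device)
print(s.device)                      # same device as x
print(s.cpu().tolist())              # [2, 4, 6]
```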

3. GPU computation with Modules

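The tensor parameters returned by a Module's .parameters() can be handled exactly as described in Section 2.
Like a Tensor, a PyTorch Module lives on the CPU and in main memory by default, and it can be moved to the GPU in similar ways:

Call the Module's .cuda() method
With multiple GPUs, use .cuda(i) to move it to the i-th GPU and its video memory (i starts at 0). Note that .cuda() targets the current CUDA device, which is GPU 0 by default, so .cuda() and .cuda(0) are equivalent unless the current device has been changed
Pass the device argument when creating the Module, or chain a call to .to() to specify the device

For a Module that is already on a GPU and in its video memory, call its .cpu() method to move it back to the CPU and main memory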

from torch import nn

net = nn.Linear(3, 1)
print(list(net.parameters())[0].device) # cpu

net.cuda(0)
print(list(net.parameters())[0].device) # cuda:0

net2 = nn.Linear(3, 1, device = torch.device('cuda' if torch.cuda.is_available() else 'cpu'))
print(list(net2.parameters())[0].device) # cuda:0

net3 = nn.Linear(3, 1).to(torch.device('cuda' if torch.cuda.is_available() else 'cpu'))
print(list(net3.parameters())[0].device) # cuda:0

net3 = net3.cpu()
print(list(net3.parameters())[0].device) # cpu
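
Unlike a Tensor, a Module is moved in place: .to(), .cuda() and .cpu() modify the Module's parameters and return the same Module object. That is why net.cuda(0) above works without reassignment, while net3 = net3.cpu() is equally valid. A small sketch, falling back to the CPU when no GPU is available:

```python
import torch
from torch import nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

net = nn.Linear(3, 1)
moved = net.to(device)     # moves the parameters in place

print(moved is net)        # True: the same Module object is returned
print(next(net.parameters()).device)
```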

Likewise, make sure the input Tensor and the model are on the same device when computing; otherwise a RuntimeError is raised

net = nn.Linear(3, 1).cuda(0)
x = torch.rand(2,3).cuda(0)
print(net(x))

net = net.cpu()
# print(net(x))   RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument mat1 in method wrapper_addmm)
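
To keep the input and the model on the same device, one option is to read the device from the model's own parameters and move the input there before the forward pass. The helper forward_on_model_device below is hypothetical (not a PyTorch API) and assumes the model has at least one parameter and that all of its parameters sit on a single device.

```python
import torch
from torch import nn

def forward_on_model_device(model, inputs):
    """Hypothetical helper: move `inputs` to the model's device, then call the model."""
    model_device = next(model.parameters()).device  # assumes at least one parameter
    return model(inputs.to(model_device))

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

net = nn.Linear(3, 1).to(device)
x = torch.rand(2, 3)                 # created on the CPU
out = forward_on_model_device(net, x)

print(out.shape)                     # torch.Size([2, 1])
print(out.device)                    # same device as the model
```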