FCN training in PyTorch, PyTorch fp16 training

Reposted

mob6454cc780924 2024-06-07 06:40:02


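Weight initialization

Weights are usually initialized with nn.init.xavier_uniform_()
Biases are set to 0 with nn.init.constant_(i.bias, 0)
I don't fully understand the details yet; noting this down for now.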

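import torch.nn as nn

# model is any nn.Module instance; modules() walks every submodule recursively
for i in model.modules():
    # i is an nn.Conv2d layer
    if isinstance(i, nn.Conv2d):
        # xavier_uniform_ is one initialization scheme (modifies the weight in place)
        nn.init.xavier_uniform_(i.weight)
        if i.bias is not None:
            # constant_ fills the tensor with a single value
            nn.init.constant_(i.bias, 0)
    elif isinstance(i, nn.Linear):
        nn.init.xavier_uniform_(i.weight)
        # a Linear layer created with bias=False has bias None, so check here too
        if i.bias is not None:
            nn.init.constant_(i.bias, 0)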
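To check what the loop above actually does, here is a minimal sketch on a toy model. The names `init_weights` and `toy` are made up for this illustration and are not part of the original notes. It relies on the fact that Xavier uniform samples from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)) when the gain is 1.

```python
import torch
import torch.nn as nn

def init_weights(model: nn.Module) -> None:
    """Xavier-uniform weights and zero biases for Conv2d and Linear layers."""
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.Linear)):
            nn.init.xavier_uniform_(m.weight)
            if m.bias is not None:
                nn.init.constant_(m.bias, 0)

# Hypothetical toy model, only used to exercise the function
toy = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(8 * 4 * 4, 10, bias=False),  # no bias: the None check matters here
)
init_weights(toy)

conv = toy[0]
# For a conv layer, fan_in = in_channels * kH * kW and fan_out = out_channels * kH * kW
fan_in = 3 * 3 * 3
fan_out = 8 * 3 * 3
bound = (6.0 / (fan_in + fan_out)) ** 0.5

print(conv.weight.abs().max().item() <= bound + 1e-6)  # weights stay inside the bound
print(torch.count_nonzero(conv.bias).item())           # number of non-zero bias entries
```

Without the `bias is not None` check, the call would fail on the last layer, because `nn.init.constant_` cannot fill `None`.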

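Loading part of a prebuilt backbone and changing the stride

These are guesses based on a few experiments.
list(model.children()) turns an instantiated model into a list of its direct submodules.
Each element is one of the self.xxx modules assigned in __init__, in the order they were registered (which is not necessarily the order in which forward calls them).
An nn.Sequential can be indexed like a nested list.
A custom class such as Bottleneck cannot be indexed, though; call .children() on it again to go one level deeper.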

from resnet_50 import resnet_50

model = resnet_50(224)
# model must be an instance of a class derived from nn.Module; children() yields its direct submodules
# list(model.children()) collects them into a list
resnet_50_d = list(model.children())
print(resnet_50_d)
# each element is one of the self.xxx submodules registered in the resnet_50 class
print(resnet_50_d[0])
# an nn.Sequential can be indexed like a list nested inside the list
print(resnet_50_d[3][0])
# the custom Bottleneck class cannot be indexed; call children() on it again instead
# print(resnet_50_d[3][0][0])
bottleneck = resnet_50_d[3][0]
bottleneck_d = list(bottleneck.children())
print(bottleneck_d)
# once a basic PyTorch layer such as nn.Conv2d is reached, its stride can be changed
# the list holds references, so this modifies the layer inside model itself
bottleneck_d[0].stride = (3,3)
print(bottleneck_d)
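Since the resnet_50 module above is not included in these notes, the same idea can be tried on a small stand-in. The classes `TinyBlock` and `TinyNet` below are hypothetical and only mimic the structure (a custom block nested inside an nn.Sequential). The sketch shows that the layer obtained through children() is the very object used by the model, so changing its stride changes the model's output size.

```python
import torch
import torch.nn as nn

class TinyBlock(nn.Module):
    """Hypothetical stand-in for a Bottleneck block."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.conv(x))

class TinyNet(nn.Module):
    """Hypothetical stand-in for the backbone."""
    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(3, 3, kernel_size=1)
        self.layer1 = nn.Sequential(TinyBlock(), nn.Identity())

    def forward(self, x):
        return self.layer1(self.stem(x))

net = TinyNet()
children = list(net.children())          # [stem, layer1], in registration order
block = children[1][0]                   # nn.Sequential supports integer indexing
# block[0] would raise a TypeError: a custom nn.Module is not subscriptable
block_children = list(block.children())  # [conv, relu]

x = torch.randn(1, 3, 8, 8)
shape_before = tuple(net(x).shape)
block_children[0].stride = (2, 2)        # same object as net.layer1[0].conv
shape_after = tuple(net(x).shape)

print(shape_before, shape_after)
```

Only the stride attribute changes here; the weights are untouched, which is why this works on a backbone whose pretrained weights are already loaded.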

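Convolution output size formula

N = floor((W − F + 2P) / S) + 1
where W is the input size, F the kernel size, P the padding and S the stride.
In PyTorch, both convolution and pooling round down by default.
MaxPool2d rounds up only when ceil_mode=True.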
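The formula can be checked against real layers. The helper `out_size` below is a name invented for this sketch. It is a simplified version of the rule: in ceil mode PyTorch additionally requires the last pooling window to start inside the (padded) input, which this helper does not model.

```python
import math
import torch
import torch.nn as nn

def out_size(w, f, p=0, s=1, ceil_mode=False):
    """Output length along one axis: round((W - F + 2P) / S) + 1, rounding down or up."""
    n = (w - f + 2 * p) / s
    return (math.ceil(n) if ceil_mode else math.floor(n)) + 1

x = torch.randn(1, 1, 10, 10)
conv = nn.Conv2d(1, 1, kernel_size=3, stride=2)
pool_floor = nn.MaxPool2d(kernel_size=3, stride=2)                 # default: round down
pool_ceil = nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True)  # round up

# (10 - 3 + 0) / 2 = 3.5, so rounding down gives 4 and rounding up gives 5
print(conv(x).shape[-1], out_size(10, 3, 0, 2))
print(pool_floor(x).shape[-1], out_size(10, 3, 0, 2))
print(pool_ceil(x).shape[-1], out_size(10, 3, 0, 2, ceil_mode=True))
```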

Format of boxes

[batch, 4, num_anchors]

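Selecting positive and negative samples for the loss

Sort all losses in descending order, then sort the resulting indices in ascending order; this gives the rank of every element. Comparing the rank with negative_num using lt yields a bool mask that marks the negative_num largest losses.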

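# positive samples: entries whose label is greater than 0
mask = torch.gt(g_label, 0)

# pick the entries with the largest loss values as negative samples
# sort the losses in descending order, then sort the indices in ascending order
_, conf_idx = conf_all_loss.sort(dim=1, descending=True)
# conf_rank[b, i] is the rank of element i in row b (0 = largest loss)
_, conf_rank = conf_idx.sort(dim=1)
# mask of the first negative_num entries
negative_mask = torch.lt(conf_rank, negative_num)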
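The double sort is easier to follow on concrete numbers. The values below are made up, and `negative_num` is assumed to hold one count per row with shape [batch, 1] so that it broadcasts along dim 1; the original notes do not state its shape.

```python
import torch

# Made-up per-anchor losses for a batch of 2 with 4 anchors each
conf_all_loss = torch.tensor([[0.1, 0.9, 0.3, 0.7],
                              [0.5, 0.2, 0.8, 0.4]])
# Assumed shape [batch, 1]: keep 2 negatives in row 0 and 1 in row 1
negative_num = torch.tensor([[2], [1]])

# First sort: conf_idx[b, k] is the position of the k-th largest loss in row b
_, conf_idx = conf_all_loss.sort(dim=1, descending=True)
# Second sort: inverting that permutation gives the rank of every position
_, conf_rank = conf_idx.sort(dim=1)
# Positions whose rank is below negative_num are the largest losses
negative_mask = torch.lt(conf_rank, negative_num)

print(conf_idx)       # row 0: [1, 3, 2, 0]
print(conf_rank)      # row 0: [3, 0, 2, 1]
print(negative_mask)  # row 0: [False, True, False, True]
```

The mask stays aligned with the original anchor order, which is why the rank is used instead of indexing with conf_idx directly.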

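Swapping dimensions

permute:
permute(a, b, c, d): dimension a of the original tensor becomes the new dimension 0, b becomes the new dimension 1, and so on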

a=torch.rand(4,3,28,32)
a.permute(0,2,3,1).shape

torch.Size([4,28,32,3])

transpose:
transpose swaps two dimensions

a=torch.rand(4,3,28,32)
a.transpose(1,3).shape

torch.Size([4,32,28,3])
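A small sketch relating the two calls: a transpose is just a permute that swaps two entries. Both return views on the same storage, so the result is generally not contiguous and needs .contiguous() (or reshape) before view. The variable names are chosen for this example only.

```python
import torch

a = torch.rand(4, 3, 28, 32)

p = a.permute(0, 3, 2, 1)   # reorder all four dimensions explicitly
t = a.transpose(1, 3)       # swap only dimensions 1 and 3

same = torch.equal(p, t)                    # both give the same tensor
shares_storage = t.data_ptr() == a.data_ptr()  # no data was copied
contiguous = t.is_contiguous()              # strides no longer match the shape

# t.view(4, -1) would raise a RuntimeError here; copy into contiguous memory first
flat = t.contiguous().view(4, -1)

print(t.shape, same, shares_storage, contiguous, flat.shape)
```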

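Splitting a Tensor

tensor.split(size, dim): splits the tensor into chunks of the given size along dim and returns them as a tuple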

x = torch.tensor([[0.3671, 0.3908, 0.0952, 0.7797, 0.8647, 0.4163, 0.2684, 0.0373],
                  [0.4714, 0.8399, 0.4101, 0.5223, 0.8602, 0.5641, 0.0650, 0.4025]])

x.split(2,1)

tensor([[0.3671, 0.3908],
        [0.4714, 0.8399]])
tensor([[0.0952, 0.7797],
        [0.4101, 0.5223]])
tensor([[0.8647, 0.4163],
        [0.8602, 0.5641]])
tensor([[0.2684, 0.0373],
        [0.0650, 0.4025]])
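As a further illustration, split can be applied to a tensor in the [batch, 4, num_anchors] layout mentioned earlier. Treating the four channels as (x, y, w, h) is an assumption made for this sketch; the notes do not say what the four values are. The example also shows that the size argument may be a list, and that the last chunk is smaller when the dimension is not evenly divisible.

```python
import torch

# Dummy boxes in [batch, 4, num_anchors] layout; the channel meaning is assumed
boxes = torch.arange(2 * 4 * 5, dtype=torch.float32).reshape(2, 4, 5)

xy, wh = boxes.split(2, dim=1)            # two chunks of size 2 along dim 1
x, y, w, h = boxes.split(1, dim=1)        # four chunks of size 1
first, rest = boxes.split([1, 3], dim=1)  # explicit chunk sizes, must sum to 4

uneven = torch.arange(10).split(4)        # chunk lengths 4, 4 and 2

print(xy.shape, x.shape, rest.shape)
print([len(c) for c in uneven])
```

torch.cat along the same dimension puts the chunks back together.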

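Expanding

Prefer tensor.repeat(a, b, c): each argument is the number of times to repeat along the corresponding dimension; extra leading arguments add new leading dimensions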

a.repeat(1,1).size()     # original shape: torch.Size([33, 55])
torch.Size([33, 55])

a.repeat(2,1).size()     # original shape: torch.Size([33, 55])
torch.Size([66, 55])

a.repeat(1,2).size()     # original shape: torch.Size([33, 55])
torch.Size([33, 110])

a.repeat(1,2,1).size()   # original shape: torch.Size([33, 55])
torch.Size([1, 66, 55])

Filtering

torch.gt() and torch.nonzero()
torch.gt() is typically used to build a mask; it returns a bool tensor with the same shape as the input

x = torch.Tensor([[2,1,4,0,2,4,5],[3,2,0,7,0,3,2]])
y = torch.gt(x,1)

tensor([[ True, False,  True, False,  True,  True,  True],
        [ True,  True, False,  True, False,  True,  True]])

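torch.nonzero() returns the coordinates of the elements that satisfy the condition, one row per element; it is typically used to drop unwanted entries
With as_tuple=True it returns one index tensor per dimension, which can be used directly for indexing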

x = torch.Tensor([[2,1,4,0,2,4,5],[3,2,0,7,0,3,2]])
y = torch.nonzero(x>3)

tensor([[0, 2],
        [0, 5],
        [0, 6],
        [1, 3]])

y = torch.nonzero(x>3,as_tuple=True)

(tensor([0, 0, 0, 1]), tensor([2, 5, 6, 3]))

print(x[y])
tensor([4., 4., 5., 7.])
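For plain selection the two approaches are interchangeable, as the following sketch on the same data suggests. The coordinate form becomes useful when the positions themselves are needed, for example to find which rows contain a match. The variable names are illustrative only.

```python
import torch

x = torch.tensor([[2., 1., 4., 0., 2., 4., 5.],
                  [3., 2., 0., 7., 0., 3., 2.]])

mask = torch.gt(x, 3)                             # bool tensor, same shape as x
by_mask = x[mask]                                 # boolean-mask indexing
by_index = x[torch.nonzero(mask, as_tuple=True)]  # index-tuple indexing

coords = torch.nonzero(mask)                      # one (row, col) pair per match
rows_with_match = torch.unique(coords[:, 0])      # rows containing a value > 3

print(by_mask, by_index, rows_with_match)
```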

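Converting a PyTorch model to an ONNX model

torch.onnx.export(model, args, f, verbose)
model: the model to convert (load its weights first)
args: an example input for the model
f: the path of the .onnx file to write
verbose: whether to print a debug description of the exported trace

First switch the model to inference mode: model.eval()
Create a random input: dummy_input = torch.randn(32, 3, 224, 224)
Generate the ONNX file: torch.onnx.export(model, dummy_input, "model.onnx", verbose=False)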

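Converting the ONNX model to a trt file

Use the following commands (the leading ! runs a shell command from a Jupyter notebook cell)

# USE_FP16 selects whether to build the engine in fp16 precision
# Assumes the previously saved file is resnet50_pytorch.onnx
# Note: recent TensorRT releases treat --explicitBatch as deprecated for ONNX input; drop it if trtexec rejects it
if USE_FP16:
    !trtexec --onnx=resnet50_pytorch.onnx --saveEngine=resnet_engine_pytorch.trt  --explicitBatch --inputIOFormats=fp16:chw --outputIOFormats=fp16:chw --fp16
else:
    !trtexec --onnx=resnet50_pytorch.onnx --saveEngine=resnet_engine_pytorch.trt  --explicitBatch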

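Running inference with the trt file

import numpy as np
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # importing this creates the CUDA context

# Load the serialized engine; the with-block closes the file afterwards
runtime = trt.Runtime(trt.Logger(trt.Logger.WARNING))
with open("resnet_engine_pytorch.trt", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# BATCH_SIZE, target_dtype, input_batch and preprocessed_images are not defined in
# this snippet and must be set up beforehand
# Set the output precision; target_dtype must match the engine's input/output precision
output = np.empty([BATCH_SIZE, 1000], dtype = target_dtype)

# Allocate device memory
d_input = cuda.mem_alloc(1 * input_batch.nbytes)
d_output = cuda.mem_alloc(1 * output.nbytes)

bindings = [int(d_input), int(d_output)]

stream = cuda.Stream()

def predict(batch): # result gets copied into output
    # Copy the input to the GPU
    cuda.memcpy_htod_async(d_input, batch, stream)
    # Run the model
    context.execute_async_v2(bindings, stream.handle, None)
    # Copy the output back to the CPU
    cuda.memcpy_dtoh_async(output, d_output, stream)
    # Block until all work queued on the stream has finished
    stream.synchronize()

    return output

# Calling the function returns the model output; the input is one batch of images
pred = predict(preprocessed_images)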
