PyTorch error checking: troubleshooting common PyTorch exceptions

Reposted

mob64ca14101b2f 2023-12-20 22:10:04


Debugging is how you improve. This post is for everyone who feels lost while chasing bugs.

1. A type conversion is needed: RuntimeError: Found dtype Long but expected Float

In other words, a tensor of dtype Long was found where a Float tensor was expected.

RuntimeError: Found dtype Long but expected Float

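This usually happens when the labels are integer (Long) tensors and the loss function expects floating-point values.

Solution: change the type before the values are passed to the loss function, so that the resulting loss already has the matching type:

loss = criterion(outputs.float(), labels.float())

Converting the loss afterwards also changes its type:

loss = torch.tensor(loss, dtype=float)

However, torch.tensor() builds a new tensor that is no longer connected to the computation graph, so a later loss.backward() fails with the error described in item 2. Converting the inputs is the safer choice.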
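The input-side conversion can be sketched as follows. This is a minimal example under assumptions that are not in the original: nn.MSELoss stands in for criterion, and the tensors are made-up toy data.

```python
import torch
import torch.nn as nn

# Assumption: MSELoss stands in for the article's `criterion`.
criterion = nn.MSELoss()

outputs = torch.randn(4, requires_grad=True)  # pretend model output (float32)
labels = torch.tensor([0, 1, 1, 0])           # integer labels, dtype torch.int64 (Long)

# Convert the labels to float before computing the loss.
loss = criterion(outputs, labels.float())

# The loss is still attached to the graph, so backward() works.
loss.backward()
```

Because the conversion happens before the loss is computed, the loss keeps its grad_fn and gradients flow back to outputs.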

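2. requires_grad needs to be set: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

This error means the loss tensor is not connected to the autograd graph: its requires_grad is False and it has no grad_fn, so backward() has nothing to differentiate. The L in L = LOSS(out, label) is itself a Tensor. It normally requires grad whenever out was computed from parameters that require grad, so the usual causes are a loss that was rebuilt with torch.tensor(), computed under torch.no_grad(), or passed through NumPy. Setting requires_grad to True on the loss lets backward() run.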

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

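Solution: loss = criterion(outputs, labels); loss.requires_grad_()
Alternatively, loss = torch.tensor(loss, dtype=float, requires_grad=True) also makes the error go away.

Prefer removing the step that detached the loss: these two calls let backward() run, but a detached loss still sends no gradients to the model parameters.

Notes:

loss.requires_grad(required_grad=True) does not work. loss.requires_grad is a boolean attribute, not a method, so calling it raises an error:

loss.requires_grad(required_grad=True)

Also, assigning loss.requires_grad = True raises an error when loss is not a leaf tensor, because the flag can only be changed on leaf tensors:

RuntimeError: you can only change requires_grad flags of leaf variables.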
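To see why forcing requires_grad on a detached loss is only a workaround, here is a small sketch. The model, data and variable names are illustrative assumptions, not taken from the original.

```python
import torch
import torch.nn as nn

# Assumptions: a tiny linear model and random data, for illustration only.
model = nn.Linear(3, 1)
criterion = nn.MSELoss()
inputs = torch.randn(5, 3)
labels = torch.randn(5, 1)

loss = criterion(model(inputs), labels)

# A loss cut off from the graph (similar to rebuilding it with torch.tensor()).
detached = loss.detach()
detached.requires_grad_()   # silences the RuntimeError ...
detached.backward()         # ... but no gradient reaches the model
grad_after_detached = model.weight.grad

# The attached loss propagates gradients to the parameters.
loss.backward()
grad_after_attached = model.weight.grad
```

After the first backward() the weight gradient is still None; only the loss that kept its grad_fn produces gradients for the model.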

3. Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

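This error shows up when running torch code. It roughly means that the input is a CPU tensor (torch.FloatTensor) while the model weights are on the GPU (torch.cuda.FloatTensor).

First, check that CUDA is being used correctly.

CUDA is usually selected like this: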

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inputs.to(device)

This looks as if it converts the inputs tensor to a CUDA tensor, but the error still occurs: it is the one in the heading of this item.

The correct way is:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inputs = inputs.to(device)

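Reason:
tensor.to() returns a new tensor and does not modify the original data, so its result has to be assigned back.

Note, however, that
Module.to() is an "in-place" method, while tensor.to() is not.

Tip:
In PyTorch, always check whether an operation is "in-place".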
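The difference between the two .to() methods can be checked without a GPU. The sketch below uses a dtype change as a stand-in for a device move (an assumption made so the example runs anywhere); with CUDA the same pattern is inputs = inputs.to(device).

```python
import torch
import torch.nn as nn

x = torch.zeros(2, 3)            # float32 tensor

x.to(torch.float64)              # result is discarded, x itself is unchanged
dtype_without_assignment = x.dtype

x = x.to(torch.float64)          # assign the returned tensor back
dtype_with_assignment = x.dtype

net = nn.Linear(3, 1)
net.to(torch.float64)            # Module.to() modifies the module in place
weight_dtype = net.weight.dtype
```

Only the tensor call needs the assignment; the module's parameters are converted even though the return value of net.to() is ignored.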

4. Fixing AttributeError: 'list' object has no attribute 'mean'

A Python list has no mean() method. Pass the list to np.mean() instead:

np.mean(arrlist,axis=1)
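A short sketch of the call above, using a made-up nested list for arrlist (the values are an assumption for illustration):

```python
import numpy as np

# Hypothetical data: a list of equal-length rows.
arrlist = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# arrlist.mean(axis=1) would raise AttributeError; np.mean accepts the list.
row_means = np.mean(arrlist, axis=1)

# Converting once to an array also makes the .mean() method available.
arr = np.asarray(arrlist)
row_means_from_array = arr.mean(axis=1)
```

np.mean() converts the list to an array internally, so both variants give the same per-row means.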

5. Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

Problem:

You are trying to do a simple arithmetic operation on a NumPy array but you see an error message like

TypeError: Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

Solution:

You are trying to subtract a float from an int64 array in place. This does not work with in-place operators like += or -=, because the float64 result cannot be written back into the int64 array.

Example:

import numpy as np
data = np.asarray([1, 2, 3, 4], dtype=np.int64) # This is an int array!
print(data - 5) # This works
print(data - 5.0) # This works as well
# This raises: Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
data -= 5.0

Option 1 (preferred):

Use - instead of -=: Instead of data -= 5.0 use data = data - 5.0

Option 2:

Explicitly cast data to float (or the first dtype of your error message):

data = data.astype('float64')
# Now this works
data -= 5.0

This option is less robust because you have to pick the correct dtype yourself. The first option works regardless of the actual dtype.

6. TypeError: can't convert np.ndarray of type numpy.uint16.

Solution: dividing by 1.0 turns the uint16 array into a float64 array, which torch can convert:

label = label/1.0
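The sketch below shows the division next to an explicit astype() cast, which may be preferable when the labels should stay integers. The array contents and the choice of np.int64 are assumptions for illustration.

```python
import numpy as np
import torch

# Hypothetical uint16 label map, e.g. as read from a 16-bit image.
label = np.array([[0, 1], [2, 65535]], dtype=np.uint16)

# The article's fix: division promotes the array to float64.
label_float = label / 1.0
t_float = torch.from_numpy(label_float)

# Alternative (assumption): cast explicitly to a supported integer dtype.
label_int = label.astype(np.int64)
t_int = torch.from_numpy(label_int)
```

An explicit cast makes the target dtype visible in the code, which helps when a loss function later expects Long class indices.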

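7. RuntimeError: CUDA error: out of memory

Other fixes found online did not solve my problem. What worked for me was:

net.to(device=device)

The error appeared while I was testing my neural network. The cause was that the network has to be placed on the GPU before its parameters are loaded, so right after defining the network I move it to the GPU. Problem solved!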
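The order described above (define the network, move it to the device, then load the weights) might look like the sketch below. Everything except net.to(device=device) is an assumption: nn.Linear stands in for the real network, the checkpoint is a temporary file created only for the demo, and map_location is added so the saved tensors are loaded directly onto the chosen device.

```python
import os
import tempfile

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Demo setup (hypothetical): write a checkpoint so the sketch is self-contained.
ckpt_path = os.path.join(tempfile.mkdtemp(), "net.pth")
torch.save(nn.Linear(4, 2).state_dict(), ckpt_path)

net = nn.Linear(4, 2)      # 1. define the network
net.to(device=device)      # 2. move it to the device first

# 3. then load the parameters, mapping them to the same device
state_dict = torch.load(ckpt_path, map_location=device)
net.load_state_dict(state_dict)
net.eval()
```

If memory is still exhausted during testing, running inference under torch.no_grad() is a common additional measure, since no activations are kept for backpropagation.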

8. TypeError: only integer scalar arrays can be converted to a scalar index

This error is caused by handling the array incorrectly. Solution:

height = np_mask.size(0)
width = np_mask.size(1)

# Change the code above to the following:
height = np.shape(np_mask)[0]
width = np.shape(np_mask)[1]
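A minimal sketch of reading the dimensions of a NumPy array, assuming np_mask is a 2-D mask (the shape used here is made up). Note that .size(0) is how a torch.Tensor reports a dimension; on a NumPy array, size is a plain integer attribute.

```python
import numpy as np

# Hypothetical 4 x 6 mask.
np_mask = np.zeros((4, 6), dtype=np.uint8)

height = np.shape(np_mask)[0]
width = np.shape(np_mask)[1]

# Equivalent and slightly more idiomatic: unpack the shape attribute.
rows, cols = np_mask.shape[:2]

# For NumPy arrays, .size is the total number of elements.
total_elements = np_mask.size
```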

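9. Error: UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 1022-1023: unexpected end of data

The project still contained the .idea and __pycache__ folders left over from earlier debugging. Deleting both folders and re-importing the project fixed it.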

10. Error: RuntimeError: Error(s) in loading state_dict for MultiScale Guide: Missing key(s) in state_dict.

checkpoint_file = os.path.join(args.checkpoint, args.test+'.pth.tar')
checkpoint = torch.load(checkpoint_file)
model.load_state_dict(checkpoint['state_dict'])

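The code above is what fails. The fix is to pass strict=False in the last line, as follows:

checkpoint_file = os.path.join(args.checkpoint, args.test+'.pth.tar')
checkpoint = torch.load(checkpoint_file)
model.load_state_dict(checkpoint['state_dict'], strict=False) # changed line

Note that with strict=False the missing keys are skipped silently, so those parameters keep their initial values. Check that the checkpoint really belongs to this model.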

model.load_state_dict(state_dict, strict=True)

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module's state_dict() function. If strict is False, the keys do not have to match.

Arguments:
state_dict (dict): a dict containing parameters and persistent buffers.
strict (bool, optional): whether to strictly enforce that the keys in state_dict match the keys returned by this module's state_dict() function. Default: True
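When strict=False is used, it helps to inspect what was actually skipped. The sketch below relies on the return value of load_state_dict(), which reports missing_keys and unexpected_keys; the small nn.Sequential model and the partial checkpoint are assumptions made for the demo.

```python
import torch.nn as nn

# Hypothetical model; its state_dict keys are 0.weight, 0.bias, 2.weight, 2.bias.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Simulate a checkpoint that only contains the first layer.
partial_state = {k: v for k, v in model.state_dict().items() if k.startswith("0.")}

# strict=True would raise "Missing key(s) in state_dict"; strict=False does not.
result = model.load_state_dict(partial_state, strict=False)

missing = sorted(result.missing_keys)      # parameters left at their initial values
unexpected = list(result.unexpected_keys)  # checkpoint entries the model does not have
```

Logging these two lists after loading makes it obvious whether only an expected layer was skipped or whether the checkpoint does not match the model at all.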

11. RuntimeError: DataLoader worker (pid 27597) is killed by signal: Terminated.

Solution: set num_workers to a smaller value.
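A sketch of the setting in question, with a toy dataset as an assumption. Each worker is a separate process, so lowering num_workers reduces the load on the machine; num_workers=0 loads the data in the main process and is a useful baseline when debugging.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical dataset of 10 samples.
features = torch.arange(10, dtype=torch.float32).unsqueeze(1)
targets = torch.arange(10)
dataset = TensorDataset(features, targets)

# num_workers=0: no worker processes, data is loaded in the main process.
loader = DataLoader(dataset, batch_size=4, num_workers=0)

num_batches = sum(1 for _ in loader)
```

Once loading works with 0 workers, the value can be raised step by step until the error reappears.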
