Debugging is how I improve. This list is for everyone who feels lost while chasing bugs.

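1. A type conversion is needed: RuntimeError: Found dtype Long but expected Float

The message means that a tensor of dtype Long (int64) was found where a Float tensor was expected. A common cause is passing integer labels to a loss such as MSELoss.

Solution: convert the tensors to float before passing them to the loss function, so that the resulting loss is float as well:

loss = criterion(outputs.float(), labels.float())

Avoid converting the result afterwards with loss = torch.tensor(loss, dtype=float). That creates a new tensor that is cut off from the autograd graph, and calling loss.backward() on it typically leads to the error in item 2.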
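The following is a minimal sketch of the cast-before-the-loss approach. The tiny nn.Linear model, the MSELoss criterion and the random data are assumptions made up for illustration; they are not from the original project.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup: a linear model trained with MSELoss.
model = nn.Linear(3, 1)
criterion = nn.MSELoss()

inputs = torch.randn(4, 3)
labels = torch.tensor([[0], [1], [1], [0]])  # integer labels, dtype is torch.int64 (Long)

outputs = model(inputs)

# Cast the labels to float before computing the loss, so the loss stays
# a float tensor that is still connected to the autograd graph.
loss = criterion(outputs, labels.float())
loss.backward()

print(loss.dtype)
print(model.weight.grad is not None)
```

Because the cast happens on the inputs of the loss rather than on its result, the gradients still flow back to the model parameters.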

 

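2. The loss does not require grad: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

This error means that backward() was called on a tensor that is not connected to the autograd graph. The loss in L = LOSS(out, label) is itself a Tensor. It has requires_grad=True only when out was computed from parameters that require grad. If the loss was rebuilt with torch.tensor(...), detached, or computed under torch.no_grad(), its requires_grad is False and backward() fails.

Workaround: loss = criterion(outputs, labels); loss.requires_grad_()
or loss = torch.tensor(loss, dtype=float, requires_grad=True). Both make the error message go away.

Be aware that both workarounds turn the loss into a leaf tensor: backward() then runs, but no gradient reaches the model parameters. If the model is supposed to train, look for the place where the graph was cut and remove that instead.

Note:

loss.requires_grad(required_grad=True) does not work. loss.requires_grad is a boolean attribute, not a method, so calling it raises an error.

Also,

loss.requires_grad = True raises an error when loss is not a leaf tensor, because the requires_grad flag can only be changed on leaf variables:
RuntimeError: you can only change requires_grad flags of leaf variables.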
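One way to check whether a loss is still attached to the autograd graph is to look at its requires_grad and grad_fn attributes. The sketch below uses a hypothetical toy model and random data (not from the original project) and compares a normal loss, a detached loss, and a detached loss that was forced to require grad.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup for illustration only.
model = nn.Linear(3, 1)
criterion = nn.MSELoss()
inputs = torch.randn(4, 3)
labels = torch.randn(4, 1)

# A loss computed from model outputs is attached to the graph.
loss = criterion(model(inputs), labels)
attached = loss.requires_grad and loss.grad_fn is not None

# A detached copy has no grad_fn; calling backward() on it would raise
# "element 0 of tensors does not require grad and does not have a grad_fn".
detached = loss.detach()
cut_off = (not detached.requires_grad) and detached.grad_fn is None

# Forcing requires_grad on the detached copy silences the error,
# but the model parameters receive no gradient.
forced = loss.detach().requires_grad_()
forced.backward()
grad_after_forced = model.weight.grad  # still None

# Backpropagating through the original loss fills the parameter gradients.
loss.backward()
grad_after_real = model.weight.grad

print(attached, cut_off, grad_after_forced is None, grad_after_real is not None)
```

If grad_fn is None right after the loss is computed, the graph was cut somewhere earlier, for example by .detach(), .item(), a NumPy round trip, or a torch.no_grad() block.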

3. Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same

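This error appears when running a PyTorch model. It says that the input is a CPU tensor (torch.FloatTensor) while the model weights are on the GPU (torch.cuda.FloatTensor).

First, check that CUDA is being used correctly.

A common way to select CUDA is: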

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inputs.to(device)

This looks as if it moves the inputs tensor to CUDA, but the same error still appears.

The correct way is:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
inputs = inputs.to(device)

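Reason:
tensor.to() returns a new tensor and does not modify the original, so the result has to be assigned back.

Note, however, that
Module.to() is an "in-place" method, while tensor.to() is not.

Tip:
In PyTorch, always check whether an operation is "in-place".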
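The difference between the two to() methods can be seen without a GPU by converting the dtype instead of the device. This is a small sketch with a made-up nn.Linear model; it falls back to the CPU when CUDA is not available.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Module.to() modifies the module in place, so no reassignment is needed.
model = nn.Linear(3, 1)
model.to(device)

# Tensor.to() returns a new tensor, so the result must be assigned back.
inputs = torch.randn(4, 3)
inputs = inputs.to(device)
outputs = model(inputs)

# The same behaviour, shown with a dtype conversion on the CPU.
x = torch.zeros(2, dtype=torch.float32)
x.to(torch.float64)            # result is discarded, x is unchanged
m = nn.Linear(2, 2)
m.to(torch.float64)            # parameters of m are converted in place

print(x.dtype, m.weight.dtype)
```

Here x keeps its original dtype because the converted tensor was thrown away, while the parameters of m were converted.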

4. AttributeError: 'list' object has no attribute 'mean'

A Python list has no mean() method. Pass the list to np.mean() instead:

np.mean(arrlist, axis=1)
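As a short illustration, assuming arrlist is a list of equally long rows (the values below are made up):

```python
import numpy as np

# Hypothetical data: a plain Python list of rows.
arrlist = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# arrlist.mean() would raise AttributeError, because list has no such method.
has_mean = hasattr(arrlist, "mean")

# np.mean accepts the list directly; axis=1 averages within each row.
row_means = np.mean(arrlist, axis=1)

print(has_mean)
print(row_means)  # [2. 5.]
```

Converting once with np.asarray(arrlist) is an alternative when the data is used in several NumPy operations.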

5. Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

Problem:

You are trying to do a simple arithmetic operation on a NumPy array but you see an error message like

TypeError: Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

Solution:

You are trying to subtract a float from an int64 array in place. In-place operators like += or -= have to store the float64 result back into the int64 array, and NumPy refuses that cast.

Example:

import numpy as np
data = np.asarray([1, 2, 3, 4], dtype=np.int64) # This is an int array!
print(data - 5) # This works
print(data - 5.0) # This works as well
# This raises: Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind'
data -= 5.0

Option 1 (preferred):

Use - instead of -=: Instead of data -= 5.0 use data = data - 5.0

Option 2:

Explicitly cast data to float (or the first dtype of your error message):

data = data.astype('float64')
# Now this works
data -= 5.0

This option is less robust because you have to pick the correct dtype yourself. The first option works regardless of the actual dtype.

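6. TypeError: can't convert np.ndarray of type numpy.uint16.

Solution: convert the array to a dtype that PyTorch supports before turning it into a tensor. Dividing by 1.0 promotes it to float64: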

label = label/1.0
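If the labels should stay integers (for example class indices), an explicit astype() is an alternative to dividing by 1.0. This is a sketch with a made-up uint16 array; int64 is chosen here as an assumption because it holds every uint16 value without loss.

```python
import numpy as np
import torch

# Hypothetical label array stored as uint16, e.g. read from a 16-bit image.
label = np.array([[0, 1], [2, 65535]], dtype=np.uint16)

# Convert to a dtype that torch.from_numpy accepts before creating the tensor.
label_tensor = torch.from_numpy(label.astype(np.int64))

print(label_tensor.dtype)
```

Use astype(np.float32) instead when the loss function expects float targets.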

7. RuntimeError: CUDA error: out of memory

Other solutions can be found online, but none of them fixed my case. What worked for me was:

net.to(device=device)

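In my case the error appeared while testing the network. The network has to be moved to the GPU before its parameters are loaded, so call net.to(device=device) right after the network is constructed. Problem solved!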
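The order described above could look roughly like the sketch below. The nn.Linear network, the temporary checkpoint file and the use of map_location are assumptions for illustration and are not taken from the original project; the example falls back to the CPU when CUDA is not available.

```python
import os
import tempfile

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a throwaway checkpoint so the example is self-contained.
trained = nn.Linear(3, 1)
checkpoint_path = os.path.join(tempfile.mkdtemp(), "demo.pth")
torch.save(trained.state_dict(), checkpoint_path)

# 1. Construct the network and move it to the device first.
net = nn.Linear(3, 1)
net.to(device=device)

# 2. Then load the parameters. map_location (an addition in this sketch)
#    maps the saved tensors onto the chosen device while loading.
state_dict = torch.load(checkpoint_path, map_location=device)
net.load_state_dict(state_dict)

# 3. Run the test pass without building an autograd graph.
net.eval()
with torch.no_grad():
    out = net(torch.randn(2, 3, device=device))

print(out.shape)
```

Running inference under torch.no_grad() avoids storing activations for backpropagation, which also lowers memory use during testing.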

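8. TypeError: only integer scalar arrays can be converted to a scalar index

This was caused by handling the array incorrectly. np_mask is a NumPy array, where size is an attribute holding the total number of elements, not a method that takes a dimension as in PyTorch. Solution: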

height = np_mask.size(0)
width = np_mask.size(1)

# Change the code above to:
height = np.shape(np_mask)[0]
width = np.shape(np_mask)[1]
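A quick comparison of shape and size on a NumPy array, using a made-up 4 x 6 mask:

```python
import numpy as np

# Hypothetical mask with 4 rows and 6 columns.
np_mask = np.zeros((4, 6), dtype=np.uint8)

# shape gives the length of each dimension.
height, width = np_mask.shape[:2]

# size is a plain integer attribute: the total number of elements.
total = np_mask.size

print(height, width, total)  # 4 6 24
```

np_mask.shape[0] and np.shape(np_mask)[0] return the same value; the attribute form is simply shorter.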

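9. UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 1022-1023: unexpected end of data

The project still contained the .idea and __pycache__ folders left over from an earlier debugging session. Deleting both folders and re-importing the project fixed it.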

10. RuntimeError: Error(s) in loading state_dict for MultiScale Guide: Missing key(s) in state_dict.

checkpoint_file = os.path.join(args.checkpoint, args.test+'.pth.tar')
checkpoint = torch.load(checkpoint_file)
model.load_state_dict(checkpoint['state_dict'])

The code above raises the error. The fix is to pass False (that is, strict=False) in the last call:

checkpoint_file = os.path.join(args.checkpoint, args.test+'.pth.tar')
checkpoint = torch.load(checkpoint_file)
model.load_state_dict(checkpoint['state_dict'], strict=False)  # changed line

For reference, the signature is model.load_state_dict(state_dict, strict=True):

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, the keys of state_dict must exactly match the keys returned by this module's state_dict() function. If strict is False, the keys do not have to match.

Arguments:
state_dict (dict): a dict containing parameters and persistent buffers.
strict (bool, optional): whether to strictly enforce that the keys in state_dict match the keys returned by this module's state_dict() function. Default: True

Keep in mind that with strict=False the missing keys are skipped silently, so the corresponding parameters keep their initial values.
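When loading with strict=False it can help to inspect which keys were actually skipped. load_state_dict returns an object with missing_keys and unexpected_keys fields. The two small module classes below are hypothetical and only serve to produce a mismatch.

```python
import torch.nn as nn


class Small(nn.Module):
    """Hypothetical model that produced the checkpoint."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 4)


class Bigger(nn.Module):
    """Hypothetical newer model with one extra layer."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 4)
        self.fc2 = nn.Linear(4, 2)


state_dict = Small().state_dict()
model = Bigger()

# strict=False loads the matching keys and reports the rest instead of raising.
result = model.load_state_dict(state_dict, strict=False)

print(result.missing_keys)     # keys the model expects but the checkpoint lacks
print(result.unexpected_keys)  # keys in the checkpoint the model does not have
```

Any layer listed under missing_keys is left with its random initialization, so it is worth checking that the list contains only layers that are meant to be trained from scratch.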

11. RuntimeError: DataLoader worker (pid 27597) is killed by signal: Terminated.

Solution: set num_workers to a smaller value.
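A minimal sketch of where num_workers is set, using a made-up TensorDataset. num_workers=0 loads the data in the main process, which is the most conservative setting and a useful starting point before increasing the value step by step.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical dataset with 10 samples.
dataset = TensorDataset(torch.randn(10, 3), torch.arange(10))

# num_workers=0 means no worker subprocesses are started at all.
loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=0)

num_batches = 0
num_samples = 0
for features, targets in loader:
    num_batches += 1
    num_samples += features.shape[0]

print(num_batches, num_samples)  # 3 10
```

Each worker is a separate process with its own copy of the dataset object, so fewer workers generally means less memory pressure on the host.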