one of the variables needed for gradient computation has been modified by an inplace operation:

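Original post

漫浸天空的雨色, 2022-09-15 11:07:07. Category: 炼丹随笔 (model-training notes)

© Copyright belongs to the author. This is an original work by 漫浸天空的雨色 on the 51CTO blog. Please contact the author for permission before reposting; unauthorized reposting may be pursued legally.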

While training a network today, the call to loss.backward() raised the following error:

one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [2, 64]], which is output 0 of ViewBackward, is at version 21; expected version 20 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

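Let's first look at what the error means. A tensor that is needed to compute gradients was modified in place: autograd expected to compute the gradient from version 20 of the tensor, but what it found at backward time was version 21. Following the hint and adding torch.autograd.set_detect_anomaly(True) does not make the error go away; anomaly detection only helps locate the operation that failed, it does not change the computation.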
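To make the "version" wording concrete, here is a minimal sketch that is separate from my network. The tensors w and x are made up for illustration, and `_version` is an internal attribute that I read only to show the counter changing.

```python
import torch

w = torch.randn(3, requires_grad=True)
x = torch.ones(3)

# The gradient of (w * x) with respect to w is x, so autograd saves x here.
y = (w * x).sum()
version_saved = x._version

# An in-place write bumps the version counter of x.
x[0] = 5.0
version_now = x._version

try:
    y.backward()
    message = None
except RuntimeError as err:
    # autograd notices that x no longer matches the version it saved
    message = str(err)

print(version_saved, version_now)
print(message)
```

The saved tensor and the overwritten tensor are the same object, so backward has no way to recover the value that the forward pass actually used.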

Here is the part of my network that is relevant to the problem:

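input_decoder = self.Decoder_init_input.repeat(batch_size, 1).unsqueeze(1)
for i in range(total_len):
    # input_decoder is consumed by the decoder (and saved for backward) ...
    _, _ = self.Decoder(input_decoder, hidden_decoder)
    # ... and then overwritten in place, which triggers the error in backward()
    input_decoder[index, :, :] = data[index, :]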

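Ignore the other arguments and just look at input_decoder. On every iteration it is not only passed in as the input; the second step of the loop also changes its values before the next iteration begins.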

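This looks perfectly reasonable to us, but autograd cannot accept it. The forward pass of the Decoder keeps input_decoder around for use in the backward pass, and if that same tensor is then overwritten in place on every iteration, how is torch supposed to trace the gradient back through the value it originally saw?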
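The loop pattern above can be reproduced in a small standalone sketch. Everything here is an assumption for illustration: an nn.GRU stands in for self.Decoder, a small nn.Parameter stands in for self.Decoder_init_input, and the shapes, index, data, and the loss are invented.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, feat, hidden, total_len = 4, 8, 16, 3

decoder = nn.GRU(feat, hidden, batch_first=True)  # hypothetical stand-in for self.Decoder
init_input = nn.Parameter(torch.zeros(1, feat))   # hypothetical stand-in for self.Decoder_init_input
data = torch.randn(batch_size, 1, feat)           # made-up data
index = torch.tensor([0, 2])                      # made-up rows to overwrite

input_decoder = init_input.repeat(batch_size, 1).unsqueeze(1)
hidden_decoder = torch.zeros(1, batch_size, hidden)

loss = 0.0
for i in range(total_len):
    out, hidden_decoder = decoder(input_decoder, hidden_decoder)
    loss = loss + out.sum()
    # In-place write into the tensor the decoder just saved for backward
    input_decoder[index, :, :] = data[index, :, :]

try:
    loss.backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)

print(error_message)
```

The forward loop itself runs without complaint; the problem only surfaces when backward() checks the saved inputs.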

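So the correct approach is to store each step's result in a different variable. In other words, self.Decoder() should not be fed the same input_decoder tensor on every iteration. The simplest fix is the following change: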

input_decoder = self.Decoder_init_input.repeat(batch_size, 1).unsqueeze(1)
input_decoder_collector = [input_decoder]
for i in range(total_len):
    _, _ = self.Decoder(input_decoder_collector[i], hidden_decoder)
    # clone first, so the tensor the decoder saw is never modified
    temp = input_decoder_collector[i].clone()
    temp[index, :, :] = data[index, :]
    input_decoder_collector.append(temp)

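This way autograd can trace back to the input_decoder that was fed in on each iteration. It does use more memory, of course, and there may well be a cleverer way to make the change. If you have ideas, you are very welcome to share them in the comments (๑•̀ㅂ•́)و✧~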
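One possible variation, sketched under the same kind of made-up setup (an nn.GRU standing in for self.Decoder, invented shapes, index, data, and loss), is to drop the list and simply rebind the name to the clone. I would not expect this to save memory, since autograd keeps each saved input alive until backward anyway; it only removes the bookkeeping.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
batch_size, feat, hidden, total_len = 4, 8, 16, 3

decoder = nn.GRU(feat, hidden, batch_first=True)  # hypothetical stand-in for self.Decoder
init_input = nn.Parameter(torch.zeros(1, feat))   # hypothetical stand-in for self.Decoder_init_input
data = torch.randn(batch_size, 1, feat)           # made-up data
index = torch.tensor([0, 2])                      # made-up rows to overwrite

input_decoder = init_input.repeat(batch_size, 1).unsqueeze(1)
hidden_decoder = torch.zeros(1, batch_size, hidden)

loss = 0.0
for i in range(total_len):
    out, hidden_decoder = decoder(input_decoder, hidden_decoder)
    loss = loss + out.sum()
    # Write into a fresh copy; the tensor the decoder saved stays untouched
    next_input = input_decoder.clone()
    next_input[index, :, :] = data[index, :, :]
    # Rebind the name; the old tensor is still referenced by the autograd graph
    input_decoder = next_input

loss.backward()
grads_ok = init_input.grad is not None and decoder.weight_ih_l0.grad is not None
print(grads_ok)
```

The in-place write still happens, but only on a tensor that no earlier operation has saved, which is the same idea as the collector list above.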
