RuntimeError

When I run my code, the following method raises an error:

    def train_emb(self, images, captions, lengths, image_lengths=None, warmup_alpha=None):
        """One training step given images and captions."""
        self.Eiters += 1
        self.logger.update('Eit', self.Eiters)
        self.logger.update('lr', self.optimizer.param_groups[0]['lr'])

        captions_all = captions.reshape(captions.size(0) * captions.size(1), captions.size(2))
        caption_lens = lengths.reshape(-1)

        # compute the embeddings    #(256, 1024)
        img_emb, cap_emb = self.forward_emb(images, captions_all, caption_lens, image_lengths=image_lengths)

        # measure accuracy and record loss
        self.optimizer.zero_grad()
        loss = self.forward_loss(img_emb, cap_emb)

        if warmup_alpha is not None:
            loss = loss * warmup_alpha

        # compute gradient and update
        loss.backward(retain_graph=True)

        # Adversarial Training
        img_real = img_emb.detach() # img_real(256,1024)
        cap_real = cap_emb.detach() # cap_real(256,1024)

        # Generate fake embeddings
        img_fake = self.img_gen(cap_emb).detach()   # cap_emb(256,1024), img_fake(256,2048)
        cap_fake = self.txt_gen(img_emb).detach()   #img_emb(256,1024),  cap_fake(256,1024)

        # Train discriminators
        img_real.requires_grad = True
        img_fake.requires_grad = True
        cap_real.requires_grad = True
        cap_fake.requires_grad = True

        disc_img_real = self.img_disc(img_real)  # img_real(256,1024) disc_img_real()
        disc_img_fake = self.img_disc(img_fake)  #img_fake(256,1024)  disc_img_fake()

        disc_cap_real = self.txt_disc(cap_real)  #cap_real(256,1024) dis_cap_real()
        disc_cap_fake = self.txt_disc(cap_fake)  #cap_fake(256,1024) dis_cap_fake()

        disc_loss_img = self.gan_criterion(disc_img_real, True) + self.gan_criterion(disc_img_fake, False)
        disc_loss_cap = self.gan_criterion(disc_cap_real, True) + self.gan_criterion(disc_cap_fake, False)
        total_disc_loss = disc_loss_img + disc_loss_cap

        total_disc_loss.backward(retain_graph=True)

        clip_grad_norm_(self.params, self.grad_clip)
        self.optimizer.step()

        # Train generators
        #self.gen_optim.zero_grad()
        img_fake.requires_grad = False
        cap_fake.requires_grad = False

        self.optimizer.zero_grad()  # Clear gradients for generator training

        gen_img = self.img_gen(cap_emb) #gen_img(256,1024)
        gen_cap = self.txt_gen(img_emb)#gen_cap(256,1024)

        disc_img_fake_for_gen = self.img_disc(gen_img) #disc_img_fake_for_gen(256,1)
        disc_cap_fake_for_gen = self.txt_disc(gen_cap) #disc_cap_fake_for_gen(256,1)

        gen_loss_img = self.gan_criterion(disc_img_fake_for_gen, True)
        gen_loss_cap = self.gan_criterion(disc_cap_fake_for_gen, True)
        total_gen_loss = gen_loss_img + gen_loss_cap

        total_gen_loss.backward()

        clip_grad_norm_(self.params, self.grad_clip)
        self.optimizer.step()

Stepping through with the debugger shows that the error is raised at:

total_gen_loss.backward()

The error message is:

Traceback (most recent call last):
  File "train.py", line 274, in <module>
    main()
  File "train.py", line 99, in main
    train(opt, train_loader, model, epoch, val_loader)
  File "train.py", line 155, in train
    model.train_emb(images, captions, lengths, image_lengths=img_lengths)
  File "/home/s1/ESA-main4/ESA_BERT/lib/vse.py", line 290, in train_emb
    total_gen_loss.backward()
  File "/home/s1/anaconda3/envs/s1_new/lib/python3.8/site-packages/torch/_tensor.py", line 525, in backward
    torch.autograd.backward(
  File "/home/s1/anaconda3/envs/s1_new/lib/python3.8/site-packages/torch/autograd/__init__.py", line 267, in backward
    _engine_run_backward(
  File "/home/s1/anaconda3/envs/s1_new/lib/python3.8/site-packages/torch/autograd/graph.py", line 744, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Output 0 of TBackward0 is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

How can I fix this error?

2 answers

阿里嘎多学长 2024-07-10 14:20
The following was generated jointly by ChatGPT and 阿里嘎多学长; please accept the answer if it helps:

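This RuntimeError usually means that, during backpropagation, PyTorch found a tensor that had been modified in place, where that tensor is a view and its base tensor, or another view of the same base, was modified as well. In PyTorch, a view shares memory with its base tensor, so an in-place change to one is a change to the other, and it can invalidate values that autograd saved for the backward pass.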
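To make the memory sharing concrete, here is a minimal sketch. The tensor names are made up for illustration and are not taken from the question's code; the transpose is used only because it is a common view-producing operation.

```python
import torch

# Illustrative tensors only; not part of the question's model.
base = torch.arange(6.0).reshape(2, 3)
view = base.t()  # a transpose returns a view, not a copy

# Both tensors start at the same address in the same storage.
shares_memory = view.data_ptr() == base.data_ptr()

# An in-place write through the base is visible through the view.
base[0, 1] = 100.0
seen_through_view = view[1, 0].item()
```

Here `shares_memory` is true and `seen_through_view` picks up the value written through `base`, which is why autograd has to track in-place writes to either tensor.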

Based on the error message you posted, the problem may lie in one of the following places:

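In-place operations: the error message points to an in-place operation. Check your code for in-place operators such as += and -=, and for methods whose names end in an underscore, such as add_(). These modify the tensor itself and can trigger the view problem.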
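A small sketch of how an in-place operator can break the backward pass, and how the out-of-place form avoids it. The tensors are illustrative; `exp()` is chosen only because it saves its output for the backward pass.

```python
import torch

x = torch.tensor([0.0, 1.0], requires_grad=True)

# exp() keeps its result for the backward pass, so overwriting
# that result in place leaves autograd with stale data.
y = x.exp()
y += 1  # in-place: modifies the tensor that exp() saved
try:
    y.sum().backward()
    inplace_error = None
except RuntimeError as err:
    inplace_error = str(err)

# Out-of-place version: z + 1 allocates a new tensor, and the
# saved result of exp() is left untouched.
z = x.exp()
z = z + 1
z.sum().backward()
```

The first `backward()` fails with a RuntimeError about an in-place modification, while the second one runs and fills `x.grad`.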

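Tensor views: the message "Output 0 of TBackward0 is a view" means that some operation returned a view of a tensor (TBackward0 is the autograd node recorded for a transpose, t()). Check self.forward_emb, self.img_gen, and self.txt_gen, and make sure they do not return views.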
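The phrase "a function that returns multiple views" in the error message can be reproduced in isolation. The sketch below uses `unbind()`, which is an assumption chosen for illustration; the question's code may reach the same check through a different operation.

```python
import torch

w = torch.ones(2, 3, requires_grad=True)
x = w * 2                     # non-leaf tensor tracked by autograd
row0, row1 = x.unbind(0)      # unbind returns several views of x

try:
    row0.add_(1)              # in-place write to one of those views
    view_error = None
except RuntimeError as err:
    view_error = str(err)

# Working on a copy keeps the views themselves untouched.
safe_row0 = row0.clone().add_(1)
safe_row0.sum().backward()
```

The in-place `add_` is rejected, whereas cloning the view first gives a tensor with its own memory that can be modified and still backpropagates into `w`.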

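detach() and clone(): before an in-place operation, take a .detach() or .clone() of the tensor. .detach() returns a tensor that is cut off from the computation graph, so operations on it are not tracked for gradients, but it still shares memory with the original. .clone() creates a copy with its own memory, so it is the one to use when the original values must stay untouched.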
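The difference between the two calls can be checked directly. This is a sketch with a made-up tensor `emb` standing in for an embedding such as img_emb; it is not the question's actual data.

```python
import torch

# Hypothetical stand-in for an embedding that is part of a graph.
emb = torch.ones(3, requires_grad=True) * 2

detached = emb.detach()          # no graph, but same memory as emb
copied = emb.detach().clone()    # no graph, and its own memory

detached_shares = detached.data_ptr() == emb.data_ptr()
copied_shares = copied.data_ptr() == emb.data_ptr()

# Safe: the in-place add only touches the independent copy.
copied.add_(1)
```

`detached_shares` is true and `copied_shares` is false, so an in-place write to `detached` would also change `emb`, while a write to `copied` does not.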

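Gradient clipping: the clip_grad_norm_ call may also be related. It is itself an in-place function (note the trailing underscore) that rescales the .grad of each parameter, so make sure it does not touch tensors that a later backward pass still needs.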
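To see what clip_grad_norm_ actually modifies, the following sketch clips the gradients of a small layer and compares the weights before and after. The layer, the input, and the max_norm value are all illustrative assumptions.

```python
import torch
from torch import nn
from torch.nn.utils import clip_grad_norm_

torch.manual_seed(0)
layer = nn.Linear(4, 2)  # illustrative layer, not the question's model
layer(torch.randn(8, 4)).pow(2).sum().mul(100).backward()

weights_before = layer.weight.detach().clone()

# Returns the total gradient norm measured before clipping.
norm_before = clip_grad_norm_(layer.parameters(), max_norm=1.0)

# Recompute the total gradient norm after clipping.
norm_after = torch.sqrt(sum(p.grad.pow(2).sum() for p in layer.parameters()))
weights_unchanged = torch.equal(layer.weight.detach(), weights_before)
```

The gradients are rescaled in place so that their total norm is at most max_norm, and the parameter values themselves are left as they were.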

To resolve the problem, you can try the following steps:

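Review the function implementations: check how self.forward_emb, self.img_gen, and self.txt_gen are implemented, and make sure they do not return views that depend on their input tensors.
Avoid in-place operations: if you find one, replace it with an out-of-place operation, or call .detach() or .clone() before it.
Gradient clipping: check the implementation of clip_grad_norm_ and make sure it does not affect tensor views.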

Finally, if you need further help, you can post more of the code, or the specific function implementations, so that I can give more specific advice.

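I did not find a specific reference link, but these steps are common solutions drawn from the PyTorch documentation and community experience. For a more detailed explanation or examples, see the official PyTorch documentation, in particular the sections on automatic differentiation and gradient computation.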
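One tool from the autograd part of the documentation that may help locate the offending line is anomaly detection. The sketch below wraps a deliberately broken computation in torch.autograd.detect_anomaly(); the tensors and the sigmoid are illustrative and unrelated to the question's model.

```python
import torch

x = torch.tensor([0.0, 1.0], requires_grad=True)

# With anomaly detection on, a failing backward pass also reports
# the traceback of the forward operation that created the bad node.
with torch.autograd.detect_anomaly():
    y = x.sigmoid()   # sigmoid saves its output for backward
    y.mul_(2)         # in-place change to that saved output
    try:
        y.sum().backward()
        anomaly_error = None
    except RuntimeError as err:
        anomaly_error = str(err)
```

Anomaly detection slows training down noticeably, so it is best enabled only while debugging.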

Question created on July 10
