逆风跑的人 2020-12-30 16:35 Acceptance rate: 0%
Views: 777

loss.backward() in torch raises an error: frame #0 CUDA error

Traceback (most recent call last):
  File "SH_main.py", line 229, in <module>
    loss.backward()
  File "/home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/autograd/__init__.py", line 100, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: CUDA error: an illegal memory access was encountered (copy_kernel_cuda at /opt/conda/conda-bld/pytorch_1587428266983/work/aten/src/ATen/native/cuda/Copy.cu:180)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x4e (0x7fa10d398b5e in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x240024f (0x7fa10f9c524f in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cuda.so)
frame #2: <unknown function> + 0x9146ac (0x7fa138a9e6ac in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #3: <unknown function> + 0x911d73 (0x7fa138a9bd73 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #4: at::native::copy_(at::Tensor&, at::Tensor const&, bool) + 0x44 (0x7fa138a9d834 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0x2ecd25d (0x7fa13b05725d in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #6: <unknown function> + 0xb73b43 (0x7fa138cfdb43 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #7: at::native::to(at::Tensor const&, c10::TensorOptions const&, bool, bool, c10::optional<c10::MemoryFormat>) + 0x6a0 (0x7fa138cfe690 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #8: <unknown function> + 0xe92d2a (0x7fa13901cd2a in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #9: <unknown function> + 0x291074e (0x7fa13aa9a74e in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #10: <unknown function> + 0xdd4282 (0x7fa138f5e282 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #11: at::Tensor::to(c10::TensorOptions const&, bool, bool, c10::optional<c10::MemoryFormat>) const + 0x15c (0x7fa13e3807bc in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #12: torch::autograd::CopyBackwards::apply(std::vector<at::Tensor, std::allocator<at::Tensor> >&&) + 0x47e (0x7fa13ac7c4ce in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #13: <unknown function> + 0x2ae8215 (0x7fa13ac72215 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #14: torch::autograd::Engine::evaluate_function(std::shared_ptr<torch::autograd::GraphTask>&, torch::autograd::Node*, torch::autograd::InputBuffer&) + 0x16f3 (0x7fa13ac6f513 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #15: torch::autograd::Engine::thread_main(std::shared_ptr<torch::autograd::GraphTask> const&, bool) + 0x3d2 (0x7fa13ac702f2 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #16: torch::autograd::Engine::thread_init(int) + 0x39 (0x7fa13ac68969 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so)
frame #17: torch::autograd::python::PythonEngine::thread_init(int) + 0x38 (0x7fa13dfaf558 in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #18: <unknown function> + 0xc819d (0x7fa140a0119d in /home/ydd/anaconda3/envs/py3/lib/python3.7/site-packages/torch/lib/../../../.././libstdc++.so.6)
frame #19: <unknown function> + 0x7e25 (0x7fa160761e25 in /lib64/libpthread.so.0)
frame #20: clone + 0x6d (0x7fa16048f34d in /lib64/libc.so.6)
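
A note on reading this trace: frames #11 and #12 (`at::Tensor::to` called from `torch::autograd::CopyBackwards::apply`) suggest the failure surfaces while autograd runs the backward of a copy, for example a `.to()` or `.cuda()` call in the forward pass. CUDA kernels run asynchronously, so the reported location is not necessarily where the illegal access actually happened. A minimal sketch of one way to narrow it down, assuming the script can simply be rerun: start it in a fresh interpreter with the `CUDA_LAUNCH_BLOCKING` environment variable set to `1`. The helper names `blocking_env` and `rerun_synchronously` are hypothetical, introduced here only for illustration.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base is None else base)
    # CUDA_LAUNCH_BLOCKING=1 makes each kernel launch wait for completion,
    # so an error tends to be reported closer to the call that caused it
    # (at the cost of slower execution).
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun_synchronously(script, *args):
    """Run `script` in a fresh interpreter with the variable set from the start."""
    cmd = [sys.executable, script, *args]
    return subprocess.run(cmd, env=blocking_env()).returncode


# Example (script name taken from the traceback above):
# rerun_synchronously("SH_main.py")
```

The variable is set for a child process rather than inside the running script because it needs to be in place before CUDA is initialised. Running the same batch on the CPU is another option that often yields a more specific Python-level error, such as an out-of-range index.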
 


3 answers

  • 逆风跑的人 2020-12-30 16:46

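    What could be causing this? The model and the external parameters it needs have all already been moved to the GPU.

    The GPU also has enough capacity.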
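
    The claim that everything is already on the GPU can be checked directly rather than assumed. The following is a rough sketch, not the code from `SH_main.py`: it groups parameter, buffer, and extra tensor names by device. It assumes `model` is a `torch.nn.Module` (only its `named_parameters()` and `named_buffers()` methods are used) and that the external parameters are collected in a dict; `group_by_device`, `model_devices`, and `extra` are hypothetical names.

```python
from collections import defaultdict


def group_by_device(named_tensors):
    """Map str(device) -> list of names, for (name, tensor) pairs."""
    groups = defaultdict(list)
    for name, tensor in named_tensors:
        groups[str(tensor.device)].append(name)
    return dict(groups)


def model_devices(model, extra=None):
    """Group a module's parameters, buffers, and extra tensors by device.

    `model` is assumed to be a torch.nn.Module; `extra` is an optional
    dict of name -> tensor for inputs that live outside the module.
    """
    named = list(model.named_parameters()) + list(model.named_buffers())
    named += list((extra or {}).items())
    return group_by_device(named)


# Example (hypothetical variable names):
# print(model_devices(model, {"inputs": inputs, "labels": labels}))
```

    If the result has more than one key (for instance both `cuda:0` and `cuda:1`, or `cpu` alongside a CUDA device), the tensors listed under the unexpected key are a reasonable place to start looking, since gradients for them are copied across devices during `backward()`.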
