weixin_43571870 2023-01-07 11:18 Acceptance rate: 0%
Views: 44
Closed

TensorFlow: RandLA-Net runs out of GPU memory

Symptoms and background

I am reproducing RandLA-Net in TensorFlow on a 3050 GPU.

Run output and full error message
(tf1) E:\randlanet\randla_net>python main_remote.py --mode train --gpu 0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
['E:\\randlanet\\randla_net\\data\\test\\input_0.060\\L002.ply'
 'E:\\randlanet\\randla_net\\data\\test\\input_0.060\\L003.ply'
 'E:\\randlanet\\randla_net\\data\\test\\input_0.060\\L004.ply']
['E:\\randlanet\\randla_net\\data\\test\\original_ply\\L001.ply']
Load_pc_0: E:\randlanet\randla_net\data\test\input_0.060\L002
Load_pc_1: E:\randlanet\randla_net\data\test\input_0.060\L004
Load_pc_2: E:\randlanet\randla_net\data\test\input_0.060\L003
Load_pc_3: E:\randlanet\randla_net\data\test\original_ply\L001

Preparing reprojection indices for validation and test
finished
Initiating input pipelines
****EPOCH 0****
Traceback (most recent call last):
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
    return fn(*args)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[1,64,16384,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node layers/concat_8}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

         [[optimizer/gradients/layers/Encoder_layer_3mlp1/Conv2D_grad/tuple/control_dependency_1/_979]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

  (1) Resource exhausted: OOM when allocating tensor with shape[1,64,16384,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[{{node layers/concat_8}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main_remote.py", line 344, in <module>
    model.train(dataset)
  File "E:\randlanet\randla_net\RandLANet.py", line 160, in train
    _, _, summary, l_out, probs, labels, acc = self.sess.run(ops, {self.is_training: True})
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
    run_metadata_ptr)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
    run_metadata)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
  (0) Resource exhausted: OOM when allocating tensor with shape[1,64,16384,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node layers/concat_8 (defined at D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

         [[optimizer/gradients/layers/Encoder_layer_3mlp1/Conv2D_grad/tuple/control_dependency_1/_979]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

  (1) Resource exhausted: OOM when allocating tensor with shape[1,64,16384,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
         [[node layers/concat_8 (defined at D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

0 successful operations.
0 derived errors ignored.

Original stack trace for 'layers/concat_8':
  File "main_remote.py", line 343, in <module>
    model = Network(dataset, cfg)
  File "E:\randlanet\randla_net\RandLANet.py", line 52, in __init__
    self.logits = self.inference(self.inputs, self.is_training)
  File "E:\randlanet\randla_net\RandLANet.py", line 115, in inference
    'Encoder_layer_' + str(i), is_training)
  File "E:\randlanet\randla_net\RandLANet.py", line 272, in dilated_res_block
    f_pc = self.building_block(xyz, f_pc, neigh_idx, d_out, name + 'LFA', is_training)
  File "E:\randlanet\randla_net\RandLANet.py", line 284, in building_block
    f_concat = tf.concat([f_neighbours, f_xyz], axis=-1)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\ops\array_ops.py", line 1420, in concat
    return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 1465, in concat_v2
    "ConcatV2", values=values, axis=axis, name=name)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 513, in new_func
    return func(*args, **kwargs)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()
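
For scale, the tensor named in the error message is not large by itself. A minimal sketch of the arithmetic, assuming `type float` in the message means 4-byte float32; the helper name `tensor_bytes` is made up for illustration and is not part of RandLA-Net or TensorFlow:

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_element=4):
    """Return the size in bytes of a dense tensor with the given shape."""
    return reduce(mul, shape, 1) * bytes_per_element


# Shape copied from the OOM message for node layers/concat_8.
oom_shape = (1, 64, 16384, 16)
size = tensor_bytes(oom_shape)
print(size, "bytes =", size / 2**20, "MiB")  # prints: 67108864 bytes = 64.0 MiB
```

A 64 MiB request failing suggests that the GPU allocator was already nearly full when `layers/concat_8` ran, so the total footprint of the training graph, rather than this single tensor, is the more likely culprit.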

My approach and what I have tried

For the CUDA installation I followed https://blog.csdn.net/FortuneLegend/article/details/125958288?ops_request_misc=&request_id=&biz_id=102&utm_term=rtx3050%E5%AE%89%E8%A3%85tensorflow-gpu1.13&utm_medium=distribute.pc_search_result.none-task-blog-2~all~sobaiduweb~default-2-125958288.142^v68^control,201^v4^add_ask,213^v2^t3_control2&spm=1018.2226.3001.4187
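
A possible next diagnostic step is the one the hint in the log itself points to: passing `report_tensor_allocations_upon_oom` through `RunOptions`, so that TensorFlow lists the live tensors when the OOM happens. The sketch below assumes a TensorFlow 1.x install that exposes `tf.compat.v1` (the `tensorflow_core` paths in the traceback suggest this, but the exact version is not stated). The helper name `make_oom_run_options` is hypothetical, and the commented call only mirrors the `self.sess.run(...)` line shown in the traceback.

```python
def make_oom_run_options():
    """Build RunOptions that make TensorFlow report allocations on OOM."""
    # Imported lazily so that this file can be loaded without TensorFlow.
    import tensorflow as tf

    return tf.compat.v1.RunOptions(report_tensor_allocations_upon_oom=True)


# Hypothetical change to the call at RandLANet.py line 160 (see traceback):
#
#   _, _, summary, l_out, probs, labels, acc = self.sess.run(
#       ops, {self.is_training: True}, options=make_oom_run_options())
```

If the resulting report shows that activations dominate, the usual levers are a smaller batch size or fewer points per sample in the training config. Which field names apply depends on the `cfg` that `main_remote.py` builds, which is not shown here.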

The result I want to achieve


    Question events

    • Closed by the system on January 15
    • Question created on January 7
