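Observed problem and background
Reproducing RandLA-Net with TensorFlow (tf1 environment) on an RTX 3050 GPU. Training runs out of GPU memory at epoch 0.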
Run output and full error message
(tf1) E:\randlanet\randla_net>python main_remote.py --mode train --gpu 0
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
['E:\\randlanet\\randla_net\\data\\test\\input_0.060\\L002.ply'
'E:\\randlanet\\randla_net\\data\\test\\input_0.060\\L003.ply'
'E:\\randlanet\\randla_net\\data\\test\\input_0.060\\L004.ply']
['E:\\randlanet\\randla_net\\data\\test\\original_ply\\L001.ply']
Load_pc_0: E:\randlanet\randla_net\data\test\input_0.060\L002
Load_pc_1: E:\randlanet\randla_net\data\test\input_0.060\L004
Load_pc_2: E:\randlanet\randla_net\data\test\input_0.060\L003
Load_pc_3: E:\randlanet\randla_net\data\test\original_ply\L001
Preparing reprojection indices for validation and test
finished
Initiating input pipelines
****EPOCH 0****
Traceback (most recent call last):
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
return fn(*args)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
target_list, run_metadata)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[1,64,16384,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node layers/concat_8}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[optimizer/gradients/layers/Encoder_layer_3mlp1/Conv2D_grad/tuple/control_dependency_1/_979]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[1,64,16384,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node layers/concat_8}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main_remote.py", line 344, in <module>
model.train(dataset)
File "E:\randlanet\randla_net\RandLANet.py", line 160, in train
_, _, summary, l_out, probs, labels, acc = self.sess.run(ops, {self.is_training: True})
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
run_metadata_ptr)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
feed_dict_tensor, options, run_metadata)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
run_metadata)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: 2 root error(s) found.
(0) Resource exhausted: OOM when allocating tensor with shape[1,64,16384,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node layers/concat_8 (defined at D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[optimizer/gradients/layers/Encoder_layer_3mlp1/Conv2D_grad/tuple/control_dependency_1/_979]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
(1) Resource exhausted: OOM when allocating tensor with shape[1,64,16384,16] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node layers/concat_8 (defined at D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
0 successful operations.
0 derived errors ignored.
Original stack trace for 'layers/concat_8':
File "main_remote.py", line 343, in <module>
model = Network(dataset, cfg)
File "E:\randlanet\randla_net\RandLANet.py", line 52, in __init__
self.logits = self.inference(self.inputs, self.is_training)
File "E:\randlanet\randla_net\RandLANet.py", line 115, in inference
'Encoder_layer_' + str(i), is_training)
File "E:\randlanet\randla_net\RandLANet.py", line 272, in dilated_res_block
f_pc = self.building_block(xyz, f_pc, neigh_idx, d_out, name + 'LFA', is_training)
File "E:\randlanet\randla_net\RandLANet.py", line 284, in building_block
f_concat = tf.concat([f_neighbours, f_xyz], axis=-1)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\ops\array_ops.py", line 1420, in concat
return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 1465, in concat_v2
"ConcatV2", values=values, axis=axis, name=name)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
op_def=op_def)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 513, in new_func
return func(*args, **kwargs)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
attrs, op_def, compute_device)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
op_def=op_def)
File "D:\Anaconda\envs\tf1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
self._traceback = tf_stack.extract_stack()
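
For scale, the tensor named in the OOM message can be sized by hand. This is a rough sketch, assuming that "type float" in the log means 32-bit float (4 bytes per element). The shape is copied from the error message; the variable names are only for illustration.

```python
# Shape copied from the OOM message: shape[1,64,16384,16], type float.
shape = (1, 64, 16384, 16)
bytes_per_element = 4  # assumption: "float" means float32

# Multiply the dimensions together to get the element count.
num_elements = 1
for dim in shape:
    num_elements *= dim

size_bytes = num_elements * bytes_per_element
size_mib = size_bytes / 2**20

print(f"elements: {num_elements}")
print(f"size: {size_bytes} bytes = {size_mib:.0f} MiB")
```

Under that assumption the tensor at layers/concat_8 is only about 64 MiB, so the allocation probably failed because most of the GPU memory was already in use when this node ran, not because this one tensor is unusually large.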