weixin_43589475 2020-04-28 10:03 Acceptance rate: 50%
Views: 436
Accepted

Jupyter service restarts while training a 3D U-Net with tensorflow-gpu?

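I am training a deep learning model with tensorflow 1.4.0 + CUDA 8.0 + cuDNN 6.0. When training reaches the end of the first epoch, the Jupyter service restarts. Following an earlier blog post, I limited the GPU memory usage, but it made no difference. I checked nvidia-smi, and it shows the GPU is being used normally. I am confused: CUDA is installed and the versions should be correct. Any help would be appreciated.
Code used to limit GPU usage: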

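import os

# Expose only GPU 0; set this before TensorFlow initializes CUDA.
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
import tensorflow as tf
import keras.backend.tensorflow_backend as ktf

config = tf.ConfigProto()
# Cap this process at 50% of the GPU's memory.
config.gpu_options.per_process_gpu_memory_fraction = 0.5
# Allocate GPU memory on demand instead of reserving it all up front.
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
ktf.set_session(sess)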

Output of nvidia-smi:
[screenshot]

Display after running one epoch:
[two screenshots]

The error message is as follows:

Exception in thread Thread-6:
Traceback (most recent call last):
  File "e:\anaconda3\envs\tensorflow\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "e:\anaconda3\envs\tensorflow\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "e:\anaconda3\envs\tensorflow\lib\site-packages\keras\utils\data_utils.py", line 568, in data_generator_task
    generator_output = next(self._generator)
  File "E:\Jupyter\3DUnetCNN\unet3d\generator.py", line 155, in data_generator
    skip_blank=skip_blank, permute=permute)
  File "E:\Jupyter\3DUnetCNN\unet3d\generator.py", line 210, in add_data
    data, truth = get_data_from_file(data_file, index, patch_shape=patch_shape)
  File "E:\Jupyter\3DUnetCNN\unet3d\generator.py", line 234, in get_data_from_file
    data, truth = get_data_from_file(data_file, index, patch_shape=None)
  File "E:\Jupyter\3DUnetCNN\unet3d\generator.py", line 238, in get_data_from_file
    x, y = data_file.root.data[index], data_file.root.truth[index, 0]
  File "e:\anaconda3\envs\tensorflow\lib\site-packages\tables\array.py", line 658, in __getitem__
    arr = self._read_slice(startl, stopl, stepl, shape)
  File "e:\anaconda3\envs\tensorflow\lib\site-packages\tables\array.py", line 762, in _read_slice
    self._g_read_slice(startl, stopl, stepl, nparr)
  File "tables\hdf5extension.pyx", line 1585, in tables.hdf5extension.Array._g_read_slice
tables.exceptions.HDF5ExtError: HDF5 error back trace

  File "D:\pytables_hdf5\CMake-hdf5-1.10.5\hdf5-1.10.5\src\H5Dio.c", line 199, in H5Dread
    can't read data
  File "D:\pytables_hdf5\CMake-hdf5-1.10.5\hdf5-1.10.5\src\H5Dio.c", line 601, in H5D__read
    can't read data
  File "D:\pytables_hdf5\CMake-hdf5-1.10.5\hdf5-1.10.5\src\H5Dchunk.c", line 2282, in H5D__chunk_read
    chunked read failed
  File "D:\pytables_hdf5\CMake-hdf5-1.10.5\hdf5-1.10.5\src\H5Dselect.c", line 283, in H5D__select_read
    read error
  File "D:\pytables_hdf5\CMake-hdf5-1.10.5\hdf5-1.10.5\src\H5Dselect.c", line 118, in H5D__select_io
    can't retrieve I/O vector size
  File "D:\pytables_hdf5\CMake-hdf5-1.10.5\hdf5-1.10.5\src\H5CX.c", line 1341, in H5CX_get_vec_size
    can't get default dataset transfer property list

End of HDF5 error back trace

Problems reading the array data.
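
The traceback above fails inside PyTables/HDF5 while the generator thread reads a sample (`data_file.root.data[index]`), not inside TensorFlow. It may therefore be worth checking whether the file reads cleanly outside of training. The following is a minimal diagnostic sketch, not a fix: the helper names `find_unreadable_indices` and `check_hdf5_file` and the file path are hypothetical, and the node names `data` and `truth` are taken from the traceback.

```python
def find_unreadable_indices(data, truth, n_samples):
    """Read every sample once; return (index, error) pairs for failed reads."""
    failures = []
    for index in range(n_samples):
        try:
            # Same access pattern as the failing line in generator.py.
            _ = data[index], truth[index, 0]
        except Exception as exc:  # broad on purpose: this is only a diagnostic
            failures.append((index, repr(exc)))
    return failures


def check_hdf5_file(path):
    """Open the file read-only with PyTables and test every sample."""
    import tables  # imported lazily so the helper above has no dependencies

    with tables.open_file(path, "r") as data_file:
        data = data_file.root.data
        truth = data_file.root.truth
        return find_unreadable_indices(data, truth, data.shape[0])
```

Calling `check_hdf5_file("path/to/your_data.h5")` (hypothetical path) in a plain Python session, with no training running, returns an empty list if every sample can be read. If it does, the file itself is probably intact and the read failure is more likely tied to how or where the generator accesses it during training.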

1 answer

  • threenewbee 2020-04-28 15:54

    Laptop GPUs have poor cooling and little video memory, so they are unstable. I suggest testing on a desktop card, a GTX 1060/1660 or better.
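
    To get a feel for why video memory runs short with 3D networks, here is a rough back-of-the-envelope sketch. The patch size and channel count are hypothetical, not taken from the question, and the estimate covers a single float32 feature map only (no weights, gradients, or framework overhead).

```python
from functools import reduce
from operator import mul


def tensor_mebibytes(shape, bytes_per_element=4):
    """Size of one dense tensor in MiB (float32 by default)."""
    n_elements = reduce(mul, shape, 1)
    return n_elements * bytes_per_element / (1024 ** 2)


# Hypothetical example: batch of 1, 32 channels, 128 x 128 x 128 voxels.
print(tensor_mebibytes((1, 32, 128, 128, 128)))  # 256.0
```

    A network keeps many such feature maps alive during backpropagation, so a few gigabytes of video memory can be used up quickly.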

    This answer was selected as the best answer by the asker.
