m0_58313777 2022-10-28 18:20 Acceptance rate: 73.8%
Views: 133
Closed

With tensorflow-gpu, training seems to start before the libraries have finished loading. What should I do? It makes the loss very large.

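When I train with tensorflow-gpu acceleration, training appears to start before the libraries have finished loading. What can I do about it?
I think this is why the loss is so large.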

2022-10-28 17:54:57.145385: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2022-10-28 17:55:01.362788: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-28 17:55:01.363908: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2022-10-28 17:55:01.379680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:09:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.837GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
2022-10-28 17:55:01.379824: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2022-10-28 17:55:01.771611: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2022-10-28 17:55:01.771698: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2022-10-28 17:55:02.020106: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2022-10-28 17:55:02.049772: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2022-10-28 17:55:02.060091: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2022-10-28 17:55:02.266527: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2022-10-28 17:55:02.719360: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2022-10-28 17:55:02.719468: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-10-28 17:55:03.098776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-10-28 17:55:03.098858: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2022-10-28 17:55:03.098908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2022-10-28 17:55:03.099053: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10490 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:09:00.0, compute capability: 8.6)
2022-10-28 17:55:03.099500: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
WARNING:tensorflow:From D:/resnetmemory/NeuralNetwork/test.py:132: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.

2022-10-28 17:55:03.110028: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-10-28 17:55:03.110127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:09:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.837GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
2022-10-28 17:55:03.110274: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2022-10-28 17:55:03.110342: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2022-10-28 17:55:03.110410: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2022-10-28 17:55:03.110480: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2022-10-28 17:55:03.110545: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2022-10-28 17:55:03.110608: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2022-10-28 17:55:03.110674: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2022-10-28 17:55:03.110738: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2022-10-28 17:55:03.110809: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-10-28 17:55:03.110976: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:09:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.837GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
2022-10-28 17:55:03.111112: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2022-10-28 17:55:03.111180: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2022-10-28 17:55:03.111251: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2022-10-28 17:55:03.111321: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2022-10-28 17:55:03.111388: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2022-10-28 17:55:03.111455: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2022-10-28 17:55:03.111528: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2022-10-28 17:55:03.111596: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2022-10-28 17:55:03.111669: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-10-28 17:55:03.112666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-10-28 17:55:03.112739: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2022-10-28 17:55:03.112842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2022-10-28 17:55:03.112959: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10490 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:09:00.0, compute capability: 8.6)
2022-10-28 17:55:03.113099: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-10-28 17:55:03.610947: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
Epoch 1/10
2022-10-28 17:55:04.344196: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
  1/150 [..............................] - ETA: 3:09 - loss: 1.39662022-10-28 17:55:04.883306: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2022-10-28 17:55:04.888634: I tensorflow/stream_executor/cuda/cuda_blas.cc:1838] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
150/150 [==============================] - 2s 6ms/step - loss: 0.1363
Epoch 2/10
150/150 [==============================] - 4s 25ms/step - loss: 0.0071
Epoch 3/10
150/150 [==============================] - 4s 25ms/step - loss: 0.0067
Epoch 4/10
150/150 [==============================] - 3s 22ms/step - loss: 0.0065
Epoch 5/10
150/150 [==============================] - 1s 6ms/step - loss: 0.0064
Epoch 6/10
150/150 [==============================] - 4s 24ms/step - loss: 0.0064
Epoch 7/10
150/150 [==============================] - 3s 21ms/step - loss: 0.0062
Epoch 8/10
150/150 [==============================] - 3s 17ms/step - loss: 0.0061
Epoch 9/10
150/150 [==============================] - 1s 4ms/step - loss: 0.0061
Epoch 10/10
150/150 [==============================] - 2s 17ms/step - loss: 0.0060
[0.007032716181129217, 0.006602869369089603, 0.006445118226110935, 0.006309151649475098, 0.006184305530041456, 0.006077173165977001, 0.005976531654596329]

Process finished with exit code 0
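
A note on reading the loss values in the log above: the number next to the Keras progress bar is, as far as I know, a running average over the batches seen so far in the epoch, not the loss of the latest batch. The short sketch below only does arithmetic on three numbers taken from the log (150 steps, 1.3966 after the first step, 0.1363 at the end of epoch 1). It assumes equally sized batches and an unweighted mean; the variable names are made up for illustration.

```python
# Numbers copied from the log above.
steps = 150                # batches per epoch
first_batch_loss = 1.3966  # running loss shown after step 1
epoch_mean_loss = 0.1363   # running loss shown after step 150 of epoch 1

# Assumption: the epoch figure is a plain mean over equally sized batches.
# How much of the epoch mean is due to the first batch alone?
first_batch_share = first_batch_loss / steps

# Mean loss of the remaining 149 batches, recovered from the two readings.
rest_mean = (steps * epoch_mean_loss - first_batch_loss) / (steps - 1)

print(f"first batch contributes {first_batch_share:.4f} to the epoch mean")
print(f"mean loss of batches 2..{steps}: {rest_mean:.4f}")
```

Under those assumptions the first batch accounts for less than 0.01 of the 0.1363 epoch mean, so the high epoch-1 figure mostly reflects the loss still falling during the first epoch rather than one bad first step.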



1 answer

  • 万里鹏程转瞬至 (Quality Creator in the AI field) 2022-10-29 17:22

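    You have misunderstood: by the time training starts, the libraries have in fact already been loaded. The log messages were simply sitting in the output buffer and had not yet been printed to the screen. You can set the TensorFlow log level so that I-level (INFO) messages are not printed.
    A large loss at the very start of training is normal; it gradually decreases as training goes on.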

    This answer was accepted by the asker as the best answer.
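
The log-level suggestion in the answer can be sketched as follows. The log in the question already shows `cublas64_11.dll` and `cublasLt64_11.dll` being opened at 17:55:01, before `Epoch 1/10`, so the later `dso_loader` lines during the first epoch are informational only. This is a minimal sketch, assuming the `TF_CPP_MIN_LOG_LEVEL` environment variable is used to filter the C++ log; the value `"1"` is meant to hide INFO lines while keeping warnings and errors, and it needs to be set before TensorFlow is imported. The guarded import is only there so the snippet also runs on a machine without TensorFlow.

```python
import os

# Assumption: "1" hides INFO ("I") lines and keeps warnings and errors.
# The variable has to be set before TensorFlow is imported.
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "1"

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

if tf is not None:
    # List the GPUs TensorFlow can see; the dso_loader INFO lines
    # should no longer appear around this call.
    print(tf.config.list_physical_devices("GPU"))
```

Placing these lines at the very top of the training script (here `test.py`) keeps the progress bar output from being interleaved with the INFO messages.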


Question events

  • Closed by the system on November 6
  • Answer accepted on October 29
  • Question created on October 28
