With tensorflow-gpu, training seems to start before the libraries have finished loading, which makes the loss very large. What should I do?

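When I train with tensorflow-gpu acceleration, training seems to start before the libraries have finished loading. What should I do?
This results in a very large loss.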

2022-10-28 17:54:57.145385: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2022-10-28 17:55:01.362788: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-28 17:55:01.363908: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2022-10-28 17:55:01.379680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:09:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.837GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
2022-10-28 17:55:01.379824: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2022-10-28 17:55:01.771611: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2022-10-28 17:55:01.771698: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2022-10-28 17:55:02.020106: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2022-10-28 17:55:02.049772: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2022-10-28 17:55:02.060091: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2022-10-28 17:55:02.266527: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2022-10-28 17:55:02.719360: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2022-10-28 17:55:02.719468: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-10-28 17:55:03.098776: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-10-28 17:55:03.098858: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2022-10-28 17:55:03.098908: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2022-10-28 17:55:03.099053: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10490 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:09:00.0, compute capability: 8.6)
2022-10-28 17:55:03.099500: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
WARNING:tensorflow:From D:/resnetmemory/NeuralNetwork/test.py:132: The name tf.keras.backend.set_session is deprecated. Please use tf.compat.v1.keras.backend.set_session instead.

2022-10-28 17:55:03.110028: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-10-28 17:55:03.110127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:09:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.837GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
2022-10-28 17:55:03.110274: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2022-10-28 17:55:03.110342: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2022-10-28 17:55:03.110410: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2022-10-28 17:55:03.110480: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2022-10-28 17:55:03.110545: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2022-10-28 17:55:03.110608: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2022-10-28 17:55:03.110674: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2022-10-28 17:55:03.110738: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2022-10-28 17:55:03.110809: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-10-28 17:55:03.110976: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:09:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.837GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
2022-10-28 17:55:03.111112: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2022-10-28 17:55:03.111180: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2022-10-28 17:55:03.111251: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2022-10-28 17:55:03.111321: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2022-10-28 17:55:03.111388: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2022-10-28 17:55:03.111455: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2022-10-28 17:55:03.111528: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2022-10-28 17:55:03.111596: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2022-10-28 17:55:03.111669: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-10-28 17:55:03.112666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-10-28 17:55:03.112739: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2022-10-28 17:55:03.112842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2022-10-28 17:55:03.112959: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10490 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:09:00.0, compute capability: 8.6)
2022-10-28 17:55:03.113099: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-10-28 17:55:03.610947: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
Epoch 1/10
2022-10-28 17:55:04.344196: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
  1/150 [..............................] - ETA: 3:09 - loss: 1.3966
2022-10-28 17:55:04.883306: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2022-10-28 17:55:04.888634: I tensorflow/stream_executor/cuda/cuda_blas.cc:1838] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
150/150 [==============================] - 2s 6ms/step - loss: 0.1363
Epoch 2/10
150/150 [==============================] - 4s 25ms/step - loss: 0.0071
Epoch 3/10
150/150 [==============================] - 4s 25ms/step - loss: 0.0067
Epoch 4/10
150/150 [==============================] - 3s 22ms/step - loss: 0.0065
Epoch 5/10
150/150 [==============================] - 1s 6ms/step - loss: 0.0064
Epoch 6/10
150/150 [==============================] - 4s 24ms/step - loss: 0.0064
Epoch 7/10
150/150 [==============================] - 3s 21ms/step - loss: 0.0062
Epoch 8/10
150/150 [==============================] - 3s 17ms/step - loss: 0.0061
Epoch 9/10
150/150 [==============================] - 1s 4ms/step - loss: 0.0061
Epoch 10/10
150/150 [==============================] - 2s 17ms/step - loss: 0.0060
[0.007032716181129217, 0.006602869369089603, 0.006445118226110935, 0.006309151649475098, 0.006184305530041456, 0.006077173165977001, 0.005976531654596329]

Process finished with exit code 0

1 answer

Answer from 万里鹏程转瞬至 (Quality Creator in the Artificial Intelligence field), 2022-10-29 17:22
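You've misunderstood: by the time training runs, the libraries have in fact finished loading. The log messages were simply sitting in a buffer and were not written to the screen in time. Either way, an op cannot execute before the library it depends on has been loaded, so the late log lines do not affect the loss. You can set the TensorFlow log level so that I-level (INFO) debug messages are not printed.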
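As a minimal sketch of the log-level suggestion above: TensorFlow's C++ runtime honors the `TF_CPP_MIN_LOG_LEVEL` environment variable, and a value of `1` hides the `I` (INFO) lines. The helper name `set_tf_log_level` is made up for this example, and the TensorFlow import is left as a comment so the snippet runs on its own.

```python
import os


def set_tf_log_level(level: str = "1") -> None:
    """Set the threshold for TensorFlow's C++ log messages.

    "0" shows everything, "1" hides INFO, "2" also hides WARNING,
    "3" also hides ERROR.
    """
    if level not in {"0", "1", "2", "3"}:
        raise ValueError(f"unsupported log level: {level!r}")
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = level


# Set the variable before the first `import tensorflow`, so that it is
# already in place when the native runtime starts logging.
set_tf_log_level("1")

# import tensorflow as tf  # import only after the variable has been set
```

This only hides the messages; it does not change when the libraries are loaded or how training behaves.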
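A large loss at the very start of training is normal; it decreases gradually as training continues. In your log it falls from 1.3966 on the first step to 0.0060 by epoch 10.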
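To see why the number printed for the first epoch looks large, recall that the loss Keras shows in the progress bar is the mean over the batches seen so far in that epoch. The sketch below assumes a made-up geometric decay; only the starting value 1.3966 and the 150 steps come from the log in the question.

```python
# Values taken from the log in the question.
steps = 150
first_loss = 1.3966

# Hypothetical decay rate, chosen only for illustration.
decay = 0.9

# Simulated per-batch losses for the first epoch.
batch_losses = [first_loss * decay**i for i in range(steps)]

# The progress bar reports the mean of the batch losses seen so far,
# so early high-loss batches pull the epoch figure upward.
epoch_mean = sum(batch_losses) / steps

print(f"first batch loss: {batch_losses[0]:.4f}")
print(f"last batch loss:  {batch_losses[-1]:.6f}")
print(f"epoch-1 average:  {epoch_mean:.4f}")
```

With these invented numbers the epoch average sits well above the last batch's loss, which is the same pattern as 0.1363 for epoch 1 followed by 0.0071 for epoch 2 in the log.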

This answer was selected by the asker as the best answer.


Question events

Closed by the system: November 6
Answer accepted: October 29
Question created: October 28
