Error when running TensorFlow: tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed

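When I run TensorFlow I get tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed. From what I found online, this supposedly means the GPU is occupied by something else. The trouble starts from here: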

2019-10-17 09:28:49.495166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6382 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:01:00.0, compute capability: 6.1)
(60000, 28, 28) (60000, 10)
2019-10-17 09:28:51.275415: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cublas64_100.dll'; dlerror: cublas64_100.dll not found
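
The warning above says that `cublas64_100.dll`, the cuBLAS library shipped with CUDA 10.0, could not be loaded. The prebuilt TensorFlow 2.0 GPU packages target CUDA 10.0, so a different installed CUDA version, or a CUDA `bin` directory missing from `PATH`, is a plausible cause of the later GEMM failure. The following is a minimal sketch for checking this, assuming the library is resolved through the directories listed in `PATH`; the helper name `find_on_path` is made up for illustration.

```python
import os


def find_on_path(filename, path=None):
    """Return every directory in `path` (default: the PATH variable) containing `filename`."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        # Skip empty entries produced by stray separators in PATH.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    dll = "cublas64_100.dll"  # name copied from the warning in the log
    found = find_on_path(dll)
    if found:
        print(dll, "found in:", found)
    else:
        print(dll, "is not in any PATH directory")
```

If nothing is found, installing CUDA 10.0 (or adding its `bin` directory to `PATH`) would be the next thing to try.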

[screenshot]

The error finally shown:

[screenshot]
I tried the fixes suggested online, for example adding this code:

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

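But in the end it reports:

[screenshot]

Now I don't know how to fix this. I'm a beginner trying out simple digit recognition, and I followed the tutorial step by step. My versions may differ from the tutorial's: I'm using a freshly downloaded TensorFlow 2.0 together with the following:

[screenshot]

I'm not sure whether this is a version problem. I urgently need help: is there anything else I can try? A test program that does addition runs fine. While the digit recognition script was running I checked, and GPU utilization peaked at only 0.2%. Here is the complete digit recognition code: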

import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers, datasets

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

#gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.2)
#sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

(x, y), (x_val, y_val) = datasets.mnist.load_data()
x = tf.convert_to_tensor(x, dtype=tf.float32) / 255.
y = tf.convert_to_tensor(y, dtype=tf.int32)
y = tf.one_hot(y, depth=10)
print(x.shape, y.shape)
train_dataset = tf.data.Dataset.from_tensor_slices((x, y))
train_dataset = train_dataset.batch(200)

model = keras.Sequential([
    layers.Dense(512, activation='relu'),
    layers.Dense(256, activation='relu'),
    layers.Dense(10)])

optimizer = optimizers.SGD(learning_rate=0.001)


def train_epoch(epoch):
    # Step4.loop
    for step, (x, y) in enumerate(train_dataset):

        with tf.GradientTape() as tape:
            # [b, 28, 28] => [b, 784]
            x = tf.reshape(x, (-1, 28 * 28))
            # Step1. compute output
            # [b, 784] => [b, 10]
            out = model(x)
            # Step2. compute loss
            loss = tf.reduce_sum(tf.square(out - y)) / x.shape[0]

        # Step3. optimize and update w1, w2, w3, b1, b2, b3
        grads = tape.gradient(loss, model.trainable_variables)
        # w' = w - lr * grad
        optimizer.apply_gradients(zip(grads, model.trainable_variables))

        if step % 100 == 0:
            print(epoch, step, 'loss:', loss.numpy())


def train():
    for epoch in range(30):
        train_epoch(epoch)


if __name__ == '__main__':
    train()

I hope someone can offer suggestions or a solution. Many thanks!
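
One thing worth checking, offered as a hedged sketch rather than a confirmed fix: `tf.GPUOptions`, `tf.ConfigProto` and `tf.Session` are TensorFlow 1.x names. In TensorFlow 2.0 they are only reachable under `tf.compat.v1`, so the two configuration lines in the script would be expected to fail on a 2.0 install. A TF 2.x way to express a similar intent (do not reserve all GPU memory up front) is memory growth through `tf.config.experimental`. The helper name `enable_gpu_memory_growth` is made up for illustration, and it has to run before any op touches the GPU.

```python
import tensorflow as tf


def enable_gpu_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of all at once."""
    gpus = tf.config.experimental.list_physical_devices('GPU')
    for gpu in gpus:
        # Raises RuntimeError if the GPU has already been initialized.
        tf.config.experimental.set_memory_growth(gpu, True)
    return gpus


if __name__ == '__main__':
    # Call this before building the model or creating any tensors.
    print('GPUs configured:', enable_gpu_memory_growth())
```

If the `cublas64_100.dll` warning in the log is the real cause, this setting alone would not be expected to help, because the library still cannot be loaded.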


3 answers

plus_left 2022-04-07 19:46
Has this problem been solved?

