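CPU usage is too high when running prediction with a TensorFlow-based model. How can I reduce the program's CPU usage?

As shown in the screenshot, I am running on a CPU-only server under Linux. When the TensorFlow-based model runs prediction, CPU usage is too high, peaking at 700%, which seriously affects the other programs on the server. How can I change the program to reduce its CPU usage?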

import os
import time

import cv2
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.placeholder)
from PIL import Image

def test(self, height, width, input_path, output_path, checkpoint_path):
    imgsName = sorted(os.listdir(input_path))  # list all images in the folder
    H, W = height, width
    inp_chns = 3 if self.args.model == 'color' else 1
    self.batch_size = 1 if self.args.model == 'color' else 1
    model_name = "deblur.model"
    ckpt_name = model_name + '-' + '15000'
    tf.reset_default_graph()
    graph = tf.get_default_graph()
    inputs = tf.placeholder(shape=[self.batch_size, H, W, inp_chns], dtype=tf.float32)  # input placeholder
    outputs = self.generator(inputs, reuse=False)  # build the computation graph
    saver = tf.train.Saver(tf.global_variables(), max_to_keep=2)
    # configure the session
    sess = tf.Session(graph=graph, config=tf.ConfigProto(
        device_count={"CPU": 1},
        allow_soft_placement=True,
        inter_op_parallelism_threads=1,
        intra_op_parallelism_threads=1,
        use_per_session_threads=True))
    saver.restore(sess, os.path.join(checkpoint_path, 'B5678-1-60-noise7', ckpt_name))  # load the trained model
    for imgName in imgsName:  # process each image listed above
        blur = cv2.imread(os.path.join(input_path, imgName), -1)  # read the image
        h, w = blur.shape
        x = h // 512
        #print(x)
        y = w // 512
        #print(y)
        if x > y:
            blur = np.pad(blur, ((0, ((x+1)*512 - h)), (0, ((x+1)*512 - w))), 'edge')  # pad the image to a multiple of 512*512 so it can be tiled
            after_deblur = np.zeros((((x+1)*512), ((x+1)*512)))  # empty matrix of the same size
        if x <= y:
            blur = np.pad(blur, ((0, ((y+1)*512 - h)), (0, ((y+1)*512 - w))), 'edge')  # pad the image to a multiple of 512*512 so it can be tiled
            after_deblur = np.zeros((((y+1)*512), ((y+1)*512)))  # empty matrix of the same size
        # split the image into 512*512 tiles and feed them to the network one by one
        starttotal = time.time()
        for ii in range(x+1):
            for jj in range(y+1):
                blurPad = blur[ii * 512:(ii + 1) * 512, jj * 512:(jj + 1) * 512]  # crop the 512*512 tiles in order
                blurPad = np.expand_dims(blurPad, -1)
                blurPad = np.expand_dims(blurPad, 0)
                if self.args.model != 'color':
                    blurPad = np.transpose(blurPad, (3, 1, 2, 0))
                start = time.time()
                deblur = sess.run(outputs, feed_dict={inputs: blurPad / 4095.0})  # run the tile through the graph with sess.run
                duration = time.time() - start
                res = deblur[-1]
                res = np.clip(res, a_min=0, a_max=1)
                if self.args.model != 'color':
                    res = np.transpose(res, (3, 1, 2, 0))
                res = res[0, :, :, :] * 4095.0
                res = (res.astype(np.uint16))
                res = np.squeeze(res)
                after_deblur = (after_deblur.astype(np.uint16))
                after_deblur[ii * 512:(ii + 1) * 512, jj * 512:(jj + 1) * 512] = res  # write the result into the same position of the empty matrix
        durationtotal = time.time() - starttotal
        print('total time use %4.3fs' % (durationtotal))
        #print(after_deblur.shape)
        after_deblur = after_deblur[:h, :w]
        after_deblur = np.clip(after_deblur, a_min=0, a_max=4095)
        #print(after_deblur.shape)
        imtiff = Image.fromarray(after_deblur)
        imtiff.save(os.path.join(output_path, imgName))  # write the image
    sess.close()
    del sess
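
A side note on the tiling arithmetic in the code above: the loops run over `h // 512 + 1` by `w // 512 + 1` tiles, so an image whose side is already an exact multiple of 512 gets one extra row or column that consists only of padding. This does not change the CPU percentage, but it does add `sess.run` calls. The sketch below only counts tiles; the function names and the sample sizes are made up for illustration.

```python
TILE = 512  # tile edge used in the question's code


def tiles_floor_plus_one(h, w, tile=TILE):
    """Tile count as computed in the question: (h // tile + 1) * (w // tile + 1)."""
    return (h // tile + 1) * (w // tile + 1)


def tiles_ceil(h, w, tile=TILE):
    """Tile count with ceiling division: no extra tiles for exact multiples."""
    rows = -(-h // tile)  # integer ceiling of h / tile
    cols = -(-w // tile)
    return rows * cols


for h, w in [(1024, 1024), (1000, 1500)]:
    print((h, w), tiles_floor_plus_one(h, w), tiles_ceil(h, w))
```

For sizes that are not exact multiples of 512 the two counts agree, so switching to ceiling division would only help for images that happen to align with the tile size.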

10 answers

繁华落尽，寻一世真情 2021-03-21 21:23

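OP, could you check whether these two approaches reduce memory usage? I noticed that the sort you wrote, imgsName = sorted(os.listdir(input_path)), does not order the files numerically: it sorts the names as strings, so a file named 10 comes before a file named 2. I have put a numeric sort in the code below. You may also want to consider whether sorting is needed at all; if it is not, dropping it saves a little memory as well.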
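
To make the sorting point concrete, here is a minimal sketch of the difference between string order and numeric order, without using `eval`. The file names are hypothetical and `numeric_key` is a helper name introduced only for this example.

```python
import re


def numeric_key(name):
    """Sort key that compares digit runs as integers and the rest as text."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


names = ["10.tif", "2.tif", "1.tif"]  # hypothetical file names

print(sorted(names))                   # string order: ['1.tif', '10.tif', '2.tif']
print(sorted(names, key=numeric_key))  # numeric order: ['1.tif', '2.tif', '10.tif']
```

Unlike a key based on `int(name.split(".")[0])`, this key also accepts names with a text prefix such as `img12.tif`.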

# Method 1
def test(self, height, width, input_path, output_path, checkpoint_path):
    #imgsName = sorted(os.listdir(input_path))  # list all images in the folder

    from glob import glob
    img_paths = glob(os.path.join(input_path, "*"))  # use "*.jpg" to match a single extension
    # Sort by the numeric file stem (assumes names such as "12.jpg").
    # glob returns full paths, so os.path.join(input_path, imgName) is not needed below.
    img_paths.sort(key=lambda x: int(os.path.basename(x).split(".")[0]))
    imgsName = iter(img_paths)  # iterate over all images in the folder

    H, W = height, width
    inp_chns = 3 if self.args.model == 'color' else 1
    self.batch_size = 1 if self.args.model == 'color' else 1
    model_name = "deblur.model"
    ckpt_name = model_name + '-' + '15000'
    tf.reset_default_graph()
    graph = tf.get_default_graph()
    inputs = tf.placeholder(shape=[self.batch_size, H, W, inp_chns], dtype=tf.float32)  # input placeholder
    outputs = self.generator(inputs, reuse=False)  # build the computation graph
    saver = tf.train.Saver(tf.global_variables(), max_to_keep=2)
    # configure the session
    sess = tf.Session(graph=graph, config=tf.ConfigProto(
        device_count={"CPU": 1},
        allow_soft_placement=True,
        inter_op_parallelism_threads=1,
        intra_op_parallelism_threads=1,
        use_per_session_threads=True))
    saver.restore(sess, os.path.join(checkpoint_path, 'B5678-1-60-noise7', ckpt_name))  # load the trained model
    for imgName in imgsName:  # process each image listed above
        blur = cv2.imread(imgName, -1)  # read the image
        h, w = blur.shape
        x = h // 512
        #print(x)
        y = w // 512
        #print(y)
        if x > y:
            blur = np.pad(blur, ((0, ((x+1)*512 - h)), (0, ((x+1)*512 - w))), 'edge')  # pad the image to a multiple of 512*512 so it can be tiled
            after_deblur = np.zeros((((x+1)*512), ((x+1)*512)))  # empty matrix of the same size
        if x <= y:
            blur = np.pad(blur, ((0, ((y+1)*512 - h)), (0, ((y+1)*512 - w))), 'edge')  # pad the image to a multiple of 512*512 so it can be tiled
            after_deblur = np.zeros((((y+1)*512), ((y+1)*512)))  # empty matrix of the same size
        # split the image into 512*512 tiles and feed them to the network one by one
        starttotal = time.time()
        for ii in range(x+1):
            for jj in range(y+1):
                blurPad = blur[ii * 512:(ii + 1) * 512, jj * 512:(jj + 1) * 512]  # crop the 512*512 tiles in order
                blurPad = np.expand_dims(blurPad, -1)
                blurPad = np.expand_dims(blurPad, 0)
                if self.args.model != 'color':
                    blurPad = np.transpose(blurPad, (3, 1, 2, 0))
                start = time.time()
                deblur = sess.run(outputs, feed_dict={inputs: blurPad / 4095.0})  # run the tile through the graph with sess.run
                duration = time.time() - start
                res = deblur[-1]
                res = np.clip(res, a_min=0, a_max=1)
                if self.args.model != 'color':
                    res = np.transpose(res, (3, 1, 2, 0))
                res = res[0, :, :, :] * 4095.0
                res = (res.astype(np.uint16))
                res = np.squeeze(res)
                after_deblur = (after_deblur.astype(np.uint16))
                after_deblur[ii * 512:(ii + 1) * 512, jj * 512:(jj + 1) * 512] = res  # write the result into the same position of the empty matrix
        durationtotal = time.time() - starttotal
        print('total time use %4.3fs' % (durationtotal))
        #print(after_deblur.shape)
        after_deblur = after_deblur[:h, :w]
        after_deblur = np.clip(after_deblur, a_min=0, a_max=4095)
        #print(after_deblur.shape)
        imtiff = Image.fromarray(after_deblur)
        # imgName is a full path here, so keep only the file name for the output
        imtiff.save(os.path.join(output_path, os.path.basename(imgName)))  # write the image
    sess.close()
    del sess
    
# Method 2
def test(self, height, width, input_path, output_path, checkpoint_path):

    #imgsName = sorted(os.listdir(input_path))  # list all images in the folder

    input_image = os.listdir(input_path)
    input_image.sort(key=lambda x: int(x.split(".")[0]))  # numeric sort; assumes names such as "12.jpg"
    imgsName = iter(input_image)

    H, W = height, width
    inp_chns = 3 if self.args.model == 'color' else 1
    self.batch_size = 1 if self.args.model == 'color' else 1
    model_name = "deblur.model"
    ckpt_name = model_name + '-' + '15000'
    tf.reset_default_graph()
    graph = tf.get_default_graph()
    inputs = tf.placeholder(shape=[self.batch_size, H, W, inp_chns], dtype=tf.float32)  # input placeholder
    outputs = self.generator(inputs, reuse=False)  # build the computation graph
    saver = tf.train.Saver(tf.global_variables(), max_to_keep=2)
    # configure the session
    sess = tf.Session(graph=graph, config=tf.ConfigProto(
        device_count={"CPU": 1},
        allow_soft_placement=True,
        inter_op_parallelism_threads=1,
        intra_op_parallelism_threads=1,
        use_per_session_threads=True))
    saver.restore(sess, os.path.join(checkpoint_path, 'B5678-1-60-noise7', ckpt_name))  # load the trained model
    for imgName in imgsName:  # process each image listed above
        blur = cv2.imread(os.path.join(input_path, imgName), -1)  # read the image
        h, w = blur.shape
        x = h // 512
        #print(x)
        y = w // 512
        #print(y)
        if x > y:
            blur = np.pad(blur, ((0, ((x+1)*512 - h)), (0, ((x+1)*512 - w))), 'edge')  # pad the image to a multiple of 512*512 so it can be tiled
            after_deblur = np.zeros((((x+1)*512), ((x+1)*512)))  # empty matrix of the same size
        if x <= y:
            blur = np.pad(blur, ((0, ((y+1)*512 - h)), (0, ((y+1)*512 - w))), 'edge')  # pad the image to a multiple of 512*512 so it can be tiled
            after_deblur = np.zeros((((y+1)*512), ((y+1)*512)))  # empty matrix of the same size
        # split the image into 512*512 tiles and feed them to the network one by one
        starttotal = time.time()
        for ii in range(x+1):
            for jj in range(y+1):
                blurPad = blur[ii * 512:(ii + 1) * 512, jj * 512:(jj + 1) * 512]  # crop the 512*512 tiles in order
                blurPad = np.expand_dims(blurPad, -1)
                blurPad = np.expand_dims(blurPad, 0)
                if self.args.model != 'color':
                    blurPad = np.transpose(blurPad, (3, 1, 2, 0))
                start = time.time()
                deblur = sess.run(outputs, feed_dict={inputs: blurPad / 4095.0})  # run the tile through the graph with sess.run
                duration = time.time() - start
                res = deblur[-1]
                res = np.clip(res, a_min=0, a_max=1)
                if self.args.model != 'color':
                    res = np.transpose(res, (3, 1, 2, 0))
                res = res[0, :, :, :] * 4095.0
                res = (res.astype(np.uint16))
                res = np.squeeze(res)
                after_deblur = (after_deblur.astype(np.uint16))
                after_deblur[ii * 512:(ii + 1) * 512, jj * 512:(jj + 1) * 512] = res  # write the result into the same position of the empty matrix
        durationtotal = time.time() - starttotal
        print('total time use %4.3fs' % (durationtotal))
        #print(after_deblur.shape)
        after_deblur = after_deblur[:h, :w]
        after_deblur = np.clip(after_deblur, a_min=0, a_max=4095)
        #print(after_deblur.shape)
        imtiff = Image.fromarray(after_deblur)
        imtiff.save(os.path.join(output_path, imgName))  # write the image
    sess.close()
    del sess
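
On the CPU question itself: the session config already sets inter_op_parallelism_threads and intra_op_parallelism_threads to 1, yet usage still reaches 700%. One possible explanation, not confirmed anywhere in this thread, is that thread pools outside the session config (for example OpenMP or MKL inside the numeric libraries) ignore those two settings. The sketch below shows two OS-level measures that use only the Python standard library: setting the common thread-count environment variables, and pinning the process to chosen cores on Linux. Whether the environment variables have any effect depends on how the installed TensorFlow and NumPy were built, which is an assumption here; `pin_to_cores` is a helper name introduced for this example.

```python
import os

# Assumption: these variables are read when the numeric libraries are first
# imported, so they would have to be set before "import numpy" / "import tensorflow".
for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS"):
    os.environ[var] = "1"


def pin_to_cores(cores):
    """Restrict the current process to the given CPU cores (Linux only).

    Returns the resulting affinity set, or None where the call is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)  # cores this process may currently use
    wanted = set(cores) & allowed
    if not wanted:
        wanted = {min(allowed)}        # fall back to one core that is allowed
    os.sched_setaffinity(0, wanted)    # 0 means the calling process
    return os.sched_getaffinity(0)


print(pin_to_cores({0}))
```

A process pinned to a single core cannot exceed roughly 100% in top no matter how many threads it starts, at the cost of longer prediction time. The same restriction can be applied without touching the code by launching the script through `taskset -c 0`.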
