tensorflow.GraphDef was modified concurrently during serialization


Create a saver object that will save all the variables:

        saver = tf.train.Saver(keep_checkpoint_every_n_hours=1.0)
        if FLAGS.pre_trained_checkpoint:
            train_utils.restore_fn(FLAGS)

        start_epoch = 0
        # Get the number of training/validation steps per epoch
        tr_batches = int(MODELNET_TRAIN_DATA_SIZE / FLAGS.batch_size)
        if MODELNET_TRAIN_DATA_SIZE % FLAGS.batch_size > 0:
            tr_batches += 1
        val_batches = int(MODELNET_VALIDATE_DATA_SIZE / FLAGS.batch_size)
        if MODELNET_VALIDATE_DATA_SIZE % FLAGS.batch_size > 0:
            val_batches += 1

        # The filenames argument to the TFRecordDataset initializer can either be a string,
        # a list of strings, or a tf.Tensor of strings.
        training_filenames = os.path.join(FLAGS.dataset_dir, 'train.record')
        validate_filenames = os.path.join(FLAGS.dataset_dir, 'validate.record')
        ##################
        # Training loop.
        ##################
        for training_epoch in range(start_epoch, FLAGS.how_many_training_epochs):
            print("-------------------------------------")
            print(" Epoch {} ".format(training_epoch))
            print("-------------------------------------")

            sess.run(iterator.initializer, feed_dict={filenames: training_filenames})
            for step in range(tr_batches):
                # Pull the image batch we'll use for training.
                train_batch_xs, train_batch_ys = sess.run(next_batch)

                handle = sess.partial_run_setup([d_scores, final_desc, learning_rate, summary_op,
                                                 accuracy, total_loss, grad_summ_op, train_op],
                                                [X, final_X, ground_truth,
                                                 grouping_scheme, grouping_weight, is_training,
                                                 is_training2, dropout_keep_prob])

                scores, final = sess.partial_run(handle,
                                                 [d_scores, final_desc],
                                                 feed_dict={
                                                    X: train_batch_xs,
                                                    is_training: True}
                                                 )
                schemes = gvcnn.grouping_scheme(scores, NUM_GROUP, FLAGS.num_views)
                weights = gvcnn.grouping_weight(scores, schemes)

                # Run the graph with this batch of training data.
                lr, train_summary, train_accuracy, train_loss, grad_vals, _ = \
                    sess.partial_run(handle,
                                     [learning_rate, summary_op, accuracy, total_loss, grad_summ_op, train_op],
                                     feed_dict={
                                         final_X: final,
                                         ground_truth: train_batch_ys,
                                         grouping_scheme: schemes,
                                         grouping_weight: weights,
                                         is_training2: True,
                                         dropout_keep_prob: 0.8}
                                     )

                train_writer.add_summary(train_summary, training_epoch)
                train_writer.add_summary(grad_vals, training_epoch)
                tf.logging.info('Epoch #%d, Step #%d, rate %.10f, accuracy %.1f%%, loss %f' %
                                (training_epoch, step, lr, train_accuracy * 100, train_loss))

            # Save a model checkpoint at the end of every epoch.
            checkpoint_path = os.path.join(FLAGS.train_logdir, FLAGS.ckpt_name_to_save)
            tf.logging.info('Saving to "%s-%d"', checkpoint_path, training_epoch)
            saver.save(sess, checkpoint_path, global_step=training_epoch)
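As an aside, the step-count arithmetic near the top of the loop setup (integer divide, then add one when there is a remainder) is just ceiling division. A minimal self-contained sketch, with illustrative sizes:

```python
import math

def steps_per_epoch(data_size, batch_size):
    # Same result as the int-divide-then-increment pattern above:
    # one extra step whenever the last batch is partial.
    return math.ceil(data_size / batch_size)

print(steps_per_epoch(100, 32))  # 3 full batches + 1 partial = 4
print(steps_per_epoch(96, 32))   # exact multiple -> 3
```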

2 Answers

Use a CheckpointSaverHook to save the model instead of writing your own saving code. After asynchronous optimization, session.run calls may execute concurrently rather than in the serial order of the Python code, so in general you should not chain serial session.run calls inside one loop. If you need to run multiple ops, pack them into a dict or tuple and pass them to a single session.run.

Jay_Zhou_XMU
Jay_Zhou_XMU: That may be similar, but his method doesn't solve my problem. Any advice would be appreciated!
Replied 5 months ago