c++内如何导入tensorflow中训练好的模型 10C

我现在用tensorflow训练了一个自己搭建的卷积神经网络,现在需要再用c++实现该网络,不知道如何将这个模型保存成文本文件,然后再导入c++

2个回答

tensorflow有c++ api可以对pb文件进行inference,流程和python差不多 建立tensor及节点名,session.run推理。

Csdn user default icon
上传中...
上传图片
插入图片
抄袭、复制答案,以达到刷声望分或其他目的的行为,在CSDN问答是严格禁止的,一经发现立刻封号。是时候展现真正的技术了!
其他相关推荐
Python+OpenCV计算机视觉

Python+OpenCV计算机视觉

c++内如何导入tensorflow中训练好的模型

我现在用tensorflow训练了一个自己搭建的卷积神经网络,现在需要再用c++实现该网络,不知道如何将这个模型保存成文本文件,然后再导入c++

tensorflow重载模型继续训练得到的loss比原模型继续训练得到的loss大,是什么原因??

我使用tensorflow训练了一个模型,在第10个epoch时保存模型,然后在一个新的文件里重载模型继续训练,结果我发现重载的模型在第一个epoch的loss比原模型在epoch=11的loss要大,我感觉既然是重载了原模型,那么重载模型训练的第一个epoch应该是和原模型训练的第11个epoch相等的,一直找不到问题或者自己逻辑的问题,希望大佬能指点迷津。源代码和重载模型的代码如下: ``` 原代码: from tensorflow.examples.tutorials.mnist import input_data import tensorflow as tf import os import numpy as np os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' mnist = input_data.read_data_sets("./",one_hot=True) tf.reset_default_graph() ###定义数据和标签 n_inputs = 784 n_classes = 10 X = tf.placeholder(tf.float32,[None,n_inputs],name='X') Y = tf.placeholder(tf.float32,[None,n_classes],name='Y') ###定义网络结构 n_hidden_1 = 256 n_hidden_2 = 256 layer_1 = tf.layers.dense(inputs=X,units=n_hidden_1,activation=tf.nn.relu,kernel_regularizer=tf.contrib.layers.l2_regularizer(0.01)) layer_2 = tf.layers.dense(inputs=layer_1,units=n_hidden_2,activation=tf.nn.relu,kernel_regularizer=tf.contrib.layers.l2_regularizer(0.01)) outputs = tf.layers.dense(inputs=layer_2,units=n_classes,name='outputs') pred = tf.argmax(tf.nn.softmax(outputs,axis=1),axis=1) print(pred.name) err = tf.count_nonzero((pred - tf.argmax(Y,axis=1))) cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=outputs,labels=Y),name='cost') print(cost.name) ###定义优化器 learning_rate = 0.001 optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost,name='OP') saver = tf.train.Saver() checkpoint = 'softmax_model/dense_model.cpkt' ###训练 batch_size = 100 training_epochs = 11 with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for epoch in range(training_epochs): batch_num = int(mnist.train.num_examples / batch_size) epoch_cost = 0 sumerr = 0 for i in range(batch_num): batch_x,batch_y = mnist.train.next_batch(batch_size) c,e = sess.run([cost,err],feed_dict={X:batch_x,Y:batch_y}) _ = sess.run(optimizer,feed_dict={X:batch_x,Y:batch_y}) epoch_cost += c / batch_num sumerr += e / mnist.train.num_examples if epoch == (training_epochs - 1): print('batch_cost = ',c) if epoch == (training_epochs - 2): saver.save(sess, checkpoint) print('test_error = ',sess.run(cost, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) ``` ``` 重载模型的代码: from tensorflow.examples.tutorials.mnist import input_data import tensorflow as tf import os os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' mnist = input_data.read_data_sets("./",one_hot=True) #one_hot=True指对样本标签进行独热编码 file_path = 'softmax_model/dense_model.cpkt' saver = tf.train.import_meta_graph(file_path + '.meta') graph = tf.get_default_graph() X = graph.get_tensor_by_name('X:0') Y = graph.get_tensor_by_name('Y:0') cost = graph.get_operation_by_name('cost').outputs[0] train_op = graph.get_operation_by_name('OP') training_epoch = 10 learning_rate = 0.001 batch_size = 100 with tf.Session() as sess: saver.restore(sess,file_path) print('test_cost = ',sess.run(cost, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) for epoch in range(training_epoch): batch_num = int(mnist.train.num_examples / batch_size) epoch_cost = 0 for i in range(batch_num): batch_x, batch_y = mnist.train.next_batch(batch_size) c = sess.run(cost, feed_dict={X: batch_x, Y: batch_y}) _ = sess.run(train_op, feed_dict={X: batch_x, Y: batch_y}) epoch_cost += c / batch_num print(epoch_cost) ``` 值得注意的是,我在原模型和重载模型里都计算了测试集的cost,两者的结果是一致的。说明参数载入应该是对的

求助,Tensorflow搭建AlexNet模型是训练集验证集的LOSS不收敛

如题,代码如下,请大佬赐教 ``` # coding:utf-8 import tensorflow as tf import numpy as np import time import os import cv2 import matplotlib.pyplot as plt def get_file(file_dir): images = [] labels = [] for root, sub_folders, files in os.walk(file_dir): for name in files: images.append(os.path.join(root, name)) letter = name.split('.')[0] # 对标签进行分类 if letter == 'cat': labels = np.append(labels, [0]) else: labels = np.append(labels, [1]) # shuffle(随机打乱) temp = np.array([images, labels]) temp = temp.transpose() # 建立images 与 labels 之间关系, 以矩阵形式展现 np.random.shuffle(temp) image_list = list(temp[:, 0]) label_list = list(temp[:, 1]) label_list = [int(float(i)) for i in label_list] print(image_list) print(label_list) return image_list, label_list # 返回文件名列表 def _parse_function(image_list, labels_list): image_contents = tf.read_file(image_list) image = tf.image.decode_jpeg(image_contents, channels=3) image = tf.cast(image, tf.float32) image = tf.image.resize_image_with_crop_or_pad(image, 227, 227) # 剪裁或填充处理 image = tf.image.per_image_standardization(image) # 图片标准化 labels = labels_list return image, labels # 将需要读取的数据集地址转换为专用格式 def get_batch(image_list, labels_list, batch_size): image_list = tf.cast(image_list, tf.string) labels_list = tf.cast(labels_list, tf.int32) dataset = tf.data.Dataset.from_tensor_slices((image_list, labels_list)) # 创建dataset dataset = dataset.repeat() # 无限循环 dataset = dataset.map(_parse_function) dataset = dataset.batch(batch_size) dataset = dataset.make_one_shot_iterator() return dataset # 正则化处理数据集 def batch_norm(inputs, is_training, is_conv_out=True, decay=0.999): scale = tf.Variable(tf.ones([inputs.get_shape()[-1]])) beta = tf.Variable(tf.zeros([inputs.get_shape()[-1]])) pop_mean = tf.Variable(tf.zeros([inputs.get_shape()[-1]]), trainable=False) pop_var = tf.Variable(tf.ones([inputs.get_shape()[-1]]), trainable=False) def batch_norm_train(): if is_conv_out: batch_mean, batch_var = tf.nn.moments(inputs, [0, 1, 2]) # 求均值及方差 else: batch_mean, batch_var = tf.nn.moments(inputs, [0]) train_mean = tf.assign(pop_mean, pop_mean * decay + batch_mean * (1 - decay)) train_var = tf.assign(pop_var, pop_var * decay + batch_var * (1 - decay)) with tf.control_dependencies([train_mean, train_var]): # 在train_mean, train_var计算完条件下继续 return tf.nn.batch_normalization(inputs, batch_mean, batch_var, beta, scale, 0.001) def batch_norm_test(): return tf.nn.batch_normalization(inputs, pop_mean, pop_var, beta, scale, 0.001) batch_normalization = tf.cond(is_training, batch_norm_train, batch_norm_test) return batch_normalization # 建立模型 learning_rate = 1e-4 training_iters = 200 batch_size = 50 display_step = 5 n_classes = 2 n_fc1 = 4096 n_fc2 = 2048 # 构建模型 x = tf.placeholder(tf.float32, [None, 227, 227, 3]) y = tf.placeholder(tf.int32, [None]) is_training = tf.placeholder(tf.bool) # 字典模式管理权重与偏置参数 W_conv = { 'conv1': tf.Variable(tf.truncated_normal([11, 11, 3, 96], stddev=0.0001)), 'conv2': tf.Variable(tf.truncated_normal([5, 5, 96, 256], stddev=0.01)), 'conv3': tf.Variable(tf.truncated_normal([3, 3, 256, 384], stddev=0.01)), 'conv4': tf.Variable(tf.truncated_normal([3, 3, 384, 384], stddev=0.01)), 'conv5': tf.Variable(tf.truncated_normal([3, 3, 384, 256], stddev=0.01)), 'fc1': tf.Variable(tf.truncated_normal([6 * 6 * 256, n_fc1], stddev=0.1)), 'fc2': tf.Variable(tf.truncated_normal([n_fc1, n_fc2], stddev=0.1)), 'fc3': tf.Variable(tf.truncated_normal([n_fc2, n_classes], stddev=0.1)), } b_conv = { 'conv1': tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[96])), 'conv2': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[256])), 'conv3': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[384])), 'conv4': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[384])), 'conv5': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[256])), 'fc1': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[n_fc1])), 'fc2': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[n_fc2])), 'fc3': tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[n_classes])), } x_image = tf.reshape(x, [-1, 227, 227, 3]) # 卷积层,池化层,LRN层编写 # 第一层卷积层 # 卷积层1 conv1 = tf.nn.conv2d(x_image, W_conv['conv1'], strides=[1, 4, 4, 1], padding='VALID') conv1 = tf.nn.bias_add(conv1, b_conv['conv1']) conv1 = batch_norm(conv1, is_training) #conv1 = tf.layers.batch_normalization(conv1, training=is_training) conv1 = tf.nn.relu(conv1) # 池化层1 pool1 = tf.nn.avg_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID') # LRN层 norm1 = tf.nn.lrn(pool1, 5, bias=1.0, alpha=0.001 / 9.0, beta=0.75) # 第二层卷积 # 卷积层2 conv2 = tf.nn.conv2d(norm1, W_conv['conv2'], strides=[1, 1, 1, 1], padding='SAME') conv2 = tf.nn.bias_add(conv2, b_conv['conv2']) #conv2 = tf.layers.batch_normalization(conv2, training=is_training) conv2 = batch_norm(conv2, is_training) conv2 = tf.nn.relu(conv2) # 池化层2 pool2 = tf.nn.avg_pool(conv2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID') # LRN层 #norm2 = tf.nn.lrn(pool2, 5, bias=1.0, alpha=0.001 / 9.0, beta=0.75) # 第三层卷积 # 卷积层3 conv3 = tf.nn.conv2d(pool2, W_conv['conv3'], strides=[1, 1, 1, 1], padding='SAME') conv3 = tf.nn.bias_add(conv3, b_conv['conv3']) #conv3 = tf.layers.batch_normalization(conv3, training=is_training) conv3 = batch_norm(conv3, is_training) conv3 = tf.nn.relu(conv3) # 第四层卷积 # 卷积层4 conv4 = tf.nn.conv2d(conv3, W_conv['conv4'], strides=[1, 1, 1, 1], padding='SAME') conv4 = tf.nn.bias_add(conv4, b_conv['conv4']) #conv4 = tf.layers.batch_normalization(conv4, training=is_training) conv4 = batch_norm(conv4, is_training) conv4 = tf.nn.relu(conv4) # 第五层卷积 # 卷积层5 conv5 = tf.nn.conv2d(conv4, W_conv['conv5'], strides=[1, 1, 1, 1], padding='SAME') conv5 = tf.nn.bias_add(conv5, b_conv['conv5']) #conv5 = tf.layers.batch_normalization(conv5, training=is_training) conv5 = batch_norm(conv5, is_training) conv5 = tf.nn.relu(conv5) # 池化层5 pool5 = tf.nn.avg_pool(conv5, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID') # 第六层全连接 reshape = tf.reshape(pool5, [-1, 6 * 6 * 256]) #fc1 = tf.matmul(reshape, W_conv['fc1']) fc1 = tf.add(tf.matmul(reshape, W_conv['fc1']), b_conv['fc1']) #fc1 = tf.layers.batch_normalization(fc1, training=is_training) fc1 = batch_norm(fc1, is_training, False) fc1 = tf.nn.relu(fc1) #fc1 = tf.nn.dropout(fc1, 0.5) # 第七层全连接 #fc2 = tf.matmul(fc1, W_conv['fc2']) fc2 = tf.add(tf.matmul(fc1, W_conv['fc2']), b_conv['fc2']) #fc2 = tf.layers.batch_normalization(fc2, training=is_training) fc2 = batch_norm(fc2, is_training, False) fc2 = tf.nn.relu(fc2) #fc2 = tf.nn.dropout(fc2, 0.5) # 第八层全连接(分类层) yop = tf.add(tf.matmul(fc2, W_conv['fc3']), b_conv['fc3']) # 损失函数 #y = tf.stop_gradient(y) loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=yop, labels=y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss) #update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) #with tf.control_dependencies(update_ops): # 保证train_op在update_ops执行之后再执行。 #train_op = optimizer.minimize(loss) # 评估模型 correct_predict = tf.nn.in_top_k(yop, y, 1) accuracy = tf.reduce_mean(tf.cast(correct_predict, tf.float32)) init = tf.global_variables_initializer() def onehot(labels): # 独热编码表示数据 n_sample = len(labels) n_class = max(labels) + 1 onehot_labels = np.zeros((n_sample, n_class)) onehot_labels[np.arange(n_sample), labels] = 1 # python迭代方法,将每一行对应个置1 return onehot_labels save_model = './/model//my-model.ckpt' # 模型训练 def train(epoch): with tf.Session() as sess: sess.run(init) saver = tf.train.Saver(var_list=tf.global_variables()) c = [] b = [] max_acc = 0 start_time = time.time() step = 0 global dataset dataset = dataset.get_next() for i in range(epoch): step = i image, labels = sess.run(dataset) sess.run(optimizer, feed_dict={x: image, y: labels, is_training: True}) # 训练一次 #if i % 5 == 0: loss_record = sess.run(loss, feed_dict={x: image, y: labels, is_training: True}) # 记录一次 #predict = sess.run(yop, feed_dict={x: image, y: labels, is_training: True}) acc = sess.run(accuracy, feed_dict={x: image, y: labels, is_training: True}) print("step:%d, now the loss is %f" % (step, loss_record)) #print(predict[0]) print("acc : %f" % acc) c.append(loss_record) b.append(acc) end_time = time.time() print('time:', (end_time - start_time)) start_time = end_time print('-----------%d opench is finished ------------' % (i / 5)) #if acc > max_acc: # max_acc = acc # saver.save(sess, save_model, global_step=i + 1) print('Optimization Finished!') #saver.save(sess, save_model) print('Model Save Finished!') plt.plot(c) plt.plot(b) plt.xlabel('iter') plt.ylabel('loss') plt.title('lr=%f, ti=%d, bs=%d' % (learning_rate, training_iters, batch_size)) plt.tight_layout() plt.show() X_train, y_train = get_file("D://cat_and_dog//cat_dog_train//cat_dog") # 返回为文件地址 dataset = get_batch(X_train, y_train, 100) train(100) ``` 数据文件夹为猫狗大战那个25000个图片的文件,不加入正则表达层的时候训练集loss会下降,但是acc维持不变,加入__batch norm__或者__tf.layers.batch__normalization 训练集和验证机的loss都不收敛了

TensorFlow1.14.0 训练自己模型出错 Windows fatal exception: access violation Current thread 0x00002c38 (most recent call first)

W0424 18:27:53.741117 11320 deprecation_wrapper.py:119] From D:\Anaconda\envs\tensorflow\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\data_decoders\tf_example_decoder.py:197: The name tf.VarLenFeature is deprecated. Please use tf.io.VarLenFeature instead. Windows fatal exception: access violation Current thread 0x00002c38 (most recent call first): File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 84 in _preread_check File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 122 in read File "D:\Anaconda\envs\tensorflow\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\utils\label_map_util.py", line 139 in load_labelmap File "D:\Anaconda\envs\tensorflow\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\utils\label_map_util.py", line 172 in get_label_map_dict File "D:\Anaconda\envs\tensorflow\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\data_decoders\tf_example_decoder.py", line 64 in __init__ File "D:\Anaconda\envs\tensorflow\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\data_decoders\tf_example_decoder.py", line 319 in __init__ File "D:\Anaconda\envs\tensorflow\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\builders\dataset_builder.py", line 130 in build File "train.py", line 122 in get_next File "D:\Anaconda\envs\tensorflow\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\legacy\trainer.py", line 60 in create_input_queue File "D:\Anaconda\envs\tensorflow\lib\site-packages\object_detection-0.1-py3.6.egg\object_detection\legacy\trainer.py", line 281 in train File "train.py", line 181 in main File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py", line 324 in new_func File "D:\Anaconda\envs\tensorflow\lib\site-packages\absl\app.py", line 250 in _run_main File "D:\Anaconda\envs\tensorflow\lib\site-packages\absl\app.py", line 299 in run File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 40 in run File "train.py", line 185 in <module> ``` ``` 两三天了还没有解决,尝试换到1.12.0-gpu版本会出现tensorflow._api.compat没有v1的问题。求大神解答!!!!

tensorflow模型推理,两个列表串行,输出结果是第一个列表的循环,新手求教

tensorflow模型推理,两个列表串行,输出结果是第一个列表的循环,新手求教 ``` from __future__ import print_function import argparse from datetime import datetime import os import sys import time import scipy.misc import scipy.io as sio import cv2 from glob import glob import multiprocessing os.environ["CUDA_VISIBLE_DEVICES"] = "0" import tensorflow as tf import numpy as np from PIL import Image from utils import * N_CLASSES = 20 DATA_DIR = './datasets/CIHP' LIST_PATH = './datasets/CIHP/list/val2.txt' DATA_ID_LIST = './datasets/CIHP/list/val_id2.txt' with open(DATA_ID_LIST, 'r') as f: NUM_STEPS = len(f.readlines()) RESTORE_FROM = './checkpoint/CIHP_pgn' # Load reader. with tf.name_scope("create_inputs") as scp1: reader = ImageReader(DATA_DIR, LIST_PATH, DATA_ID_LIST, None, False, False, False, None) image, label, edge_gt = reader.image, reader.label, reader.edge image_rev = tf.reverse(image, tf.stack([1])) image_list = reader.image_list image_batch = tf.stack([image, image_rev]) label_batch = tf.expand_dims(label, dim=0) # Add one batch dimension. edge_gt_batch = tf.expand_dims(edge_gt, dim=0) h_orig, w_orig = tf.to_float(tf.shape(image_batch)[1]), tf.to_float(tf.shape(image_batch)[2]) image_batch050 = tf.image.resize_images(image_batch, tf.stack([tf.to_int32(tf.multiply(h_orig, 0.50)), tf.to_int32(tf.multiply(w_orig, 0.50))])) image_batch075 = tf.image.resize_images(image_batch, tf.stack([tf.to_int32(tf.multiply(h_orig, 0.75)), tf.to_int32(tf.multiply(w_orig, 0.75))])) image_batch125 = tf.image.resize_images(image_batch, tf.stack([tf.to_int32(tf.multiply(h_orig, 1.25)), tf.to_int32(tf.multiply(w_orig, 1.25))])) image_batch150 = tf.image.resize_images(image_batch, tf.stack([tf.to_int32(tf.multiply(h_orig, 1.50)), tf.to_int32(tf.multiply(w_orig, 1.50))])) image_batch175 = tf.image.resize_images(image_batch, tf.stack([tf.to_int32(tf.multiply(h_orig, 1.75)), tf.to_int32(tf.multiply(w_orig, 1.75))])) ``` 新建网络 ``` # Create network. with tf.variable_scope('', reuse=False) as scope: net_100 = PGNModel({'data': image_batch}, is_training=False, n_classes=N_CLASSES) with tf.variable_scope('', reuse=True): net_050 = PGNModel({'data': image_batch050}, is_training=False, n_classes=N_CLASSES) with tf.variable_scope('', reuse=True): net_075 = PGNModel({'data': image_batch075}, is_training=False, n_classes=N_CLASSES) with tf.variable_scope('', reuse=True): net_125 = PGNModel({'data': image_batch125}, is_training=False, n_classes=N_CLASSES) with tf.variable_scope('', reuse=True): net_150 = PGNModel({'data': image_batch150}, is_training=False, n_classes=N_CLASSES) with tf.variable_scope('', reuse=True): net_175 = PGNModel({'data': image_batch175}, is_training=False, n_classes=N_CLASSES) # parsing net parsing_out1_050 = net_050.layers['parsing_fc'] parsing_out1_075 = net_075.layers['parsing_fc'] parsing_out1_100 = net_100.layers['parsing_fc'] parsing_out1_125 = net_125.layers['parsing_fc'] parsing_out1_150 = net_150.layers['parsing_fc'] parsing_out1_175 = net_175.layers['parsing_fc'] parsing_out2_050 = net_050.layers['parsing_rf_fc'] parsing_out2_075 = net_075.layers['parsing_rf_fc'] parsing_out2_100 = net_100.layers['parsing_rf_fc'] parsing_out2_125 = net_125.layers['parsing_rf_fc'] parsing_out2_150 = net_150.layers['parsing_rf_fc'] parsing_out2_175 = net_175.layers['parsing_rf_fc'] # edge net edge_out2_100 = net_100.layers['edge_rf_fc'] edge_out2_125 = net_125.layers['edge_rf_fc'] edge_out2_150 = net_150.layers['edge_rf_fc'] edge_out2_175 = net_175.layers['edge_rf_fc'] # combine resize parsing_out1 = tf.reduce_mean(tf.stack([tf.image.resize_images(parsing_out1_050, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out1_075, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out1_100, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out1_125, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out1_150, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out1_175, tf.shape(image_batch)[1:3,])]), axis=0) parsing_out2 = tf.reduce_mean(tf.stack([tf.image.resize_images(parsing_out2_050, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out2_075, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out2_100, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out2_125, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out2_150, tf.shape(image_batch)[1:3,]), tf.image.resize_images(parsing_out2_175, tf.shape(image_batch)[1:3,])]), axis=0) edge_out2_100 = tf.image.resize_images(edge_out2_100, tf.shape(image_batch)[1:3,]) edge_out2_125 = tf.image.resize_images(edge_out2_125, tf.shape(image_batch)[1:3,]) edge_out2_150 = tf.image.resize_images(edge_out2_150, tf.shape(image_batch)[1:3,]) edge_out2_175 = tf.image.resize_images(edge_out2_175, tf.shape(image_batch)[1:3,]) edge_out2 = tf.reduce_mean(tf.stack([edge_out2_100, edge_out2_125, edge_out2_150, edge_out2_175]), axis=0) raw_output = tf.reduce_mean(tf.stack([parsing_out1, parsing_out2]), axis=0) head_output, tail_output = tf.unstack(raw_output, num=2, axis=0) tail_list = tf.unstack(tail_output, num=20, axis=2) tail_list_rev = [None] * 20 for xx in range(14): tail_list_rev[xx] = tail_list[xx] tail_list_rev[14] = tail_list[15] tail_list_rev[15] = tail_list[14] tail_list_rev[16] = tail_list[17] tail_list_rev[17] = tail_list[16] tail_list_rev[18] = tail_list[19] tail_list_rev[19] = tail_list[18] tail_output_rev = tf.stack(tail_list_rev, axis=2) tail_output_rev = tf.reverse(tail_output_rev, tf.stack([1])) raw_output_all = tf.reduce_mean(tf.stack([head_output, tail_output_rev]), axis=0) raw_output_all = tf.expand_dims(raw_output_all, dim=0) pred_scores = tf.reduce_max(raw_output_all, axis=3) raw_output_all = tf.argmax(raw_output_all, axis=3) pred_all = tf.expand_dims(raw_output_all, dim=3) # Create 4-d tensor. raw_edge = tf.reduce_mean(tf.stack([edge_out2]), axis=0) head_output, tail_output = tf.unstack(raw_edge, num=2, axis=0) tail_output_rev = tf.reverse(tail_output, tf.stack([1])) raw_edge_all = tf.reduce_mean(tf.stack([head_output, tail_output_rev]), axis=0) raw_edge_all = tf.expand_dims(raw_edge_all, dim=0) pred_edge = tf.sigmoid(raw_edge_all) res_edge = tf.cast(tf.greater(pred_edge, 0.5), tf.int32) # prepare ground truth preds = tf.reshape(pred_all, [-1,]) gt = tf.reshape(label_batch, [-1,]) weights = tf.cast(tf.less_equal(gt, N_CLASSES - 1), tf.int32) # Ignoring all labels greater than or equal to n_classes. mIoU, update_op_iou = tf.contrib.metrics.streaming_mean_iou(preds, gt, num_classes=N_CLASSES, weights=weights) macc, update_op_acc = tf.contrib.metrics.streaming_accuracy(preds, gt, weights=weights) # # Which variables to load. # restore_var = tf.global_variables() # # Set up tf session and initialize variables. # config = tf.ConfigProto() # config.gpu_options.allow_growth = True # # gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.7) # # config=tf.ConfigProto(gpu_options=gpu_options) # init = tf.global_variables_initializer() # evaluate prosessing parsing_dir = './output' # Set up tf session and initialize variables. config = tf.ConfigProto() config.gpu_options.allow_growth = True ``` 以上是初始化网络和初始化参数载入模型,下面定义两个函数分别处理val1.txt和val2.txt两个列表内部的数据。 ``` # 处理第一个列表函数 def humanParsing1(): # Which variables to load. restore_var = tf.global_variables() init = tf.global_variables_initializer() with tf.Session(config=config) as sess: sess.run(init) sess.run(tf.local_variables_initializer()) # Load weights. loader = tf.train.Saver(var_list=restore_var) if RESTORE_FROM is not None: if load(loader, sess, RESTORE_FROM): print(" [*] Load SUCCESS") else: print(" [!] Load failed...") # Create queue coordinator. coord = tf.train.Coordinator() # Start queue threads. threads = tf.train.start_queue_runners(coord=coord, sess=sess) # Iterate over training steps. for step in range(NUM_STEPS): # parsing_, scores, edge_ = sess.run([pred_all, pred_scores, pred_edge])# , update_op parsing_, scores, edge_ = sess.run([pred_all, pred_scores, pred_edge]) # , update_op print('step {:d}'.format(step)) print(image_list[step]) img_split = image_list[step].split('/') img_id = img_split[-1][:-4] msk = decode_labels(parsing_, num_classes=N_CLASSES) parsing_im = Image.fromarray(msk[0]) parsing_im.save('{}/{}_vis.png'.format(parsing_dir, img_id)) coord.request_stop() coord.join(threads) # 处理第二个列表函数 def humanParsing2(): # Set up tf session and initialize variables. config = tf.ConfigProto() config.gpu_options.allow_growth = True # gpu_options=tf.GPUOptions(per_process_gpu_memory_fraction=0.7) # config=tf.ConfigProto(gpu_options=gpu_options) # Which variables to load. restore_var = tf.global_variables() init = tf.global_variables_initializer() with tf.Session(config=config) as sess: # Create queue coordinator. coord = tf.train.Coordinator() sess.run(init) sess.run(tf.local_variables_initializer()) # Load weights. loader = tf.train.Saver(var_list=restore_var) if RESTORE_FROM is not None: if load(loader, sess, RESTORE_FROM): print(" [*] Load SUCCESS") else: print(" [!] Load failed...") LIST_PATH = './datasets/CIHP/list/val1.txt' DATA_ID_LIST = './datasets/CIHP/list/val_id1.txt' with open(DATA_ID_LIST, 'r') as f: NUM_STEPS = len(f.readlines()) # with tf.name_scope("create_inputs"): with tf.name_scope(scp1): tf.get_variable_scope().reuse_variables() reader = ImageReader(DATA_DIR, LIST_PATH, DATA_ID_LIST, None, False, False, False, coord) image, label, edge_gt = reader.image, reader.label, reader.edge image_rev = tf.reverse(image, tf.stack([1])) image_list = reader.image_list # Start queue threads. threads = tf.train.start_queue_runners(coord=coord, sess=sess) # Load weights. loader = tf.train.Saver(var_list=restore_var) if RESTORE_FROM is not None: if load(loader, sess, RESTORE_FROM): print(" [*] Load SUCCESS") else: print(" [!] Load failed...") # Iterate over training steps. for step in range(NUM_STEPS): # parsing_, scores, edge_ = sess.run([pred_all, pred_scores, pred_edge])# , update_op parsing_, scores, edge_ = sess.run([pred_all, pred_scores, pred_edge]) # , update_op print('step {:d}'.format(step)) print(image_list[step]) img_split = image_list[step].split('/') img_id = img_split[-1][:-4] msk = decode_labels(parsing_, num_classes=N_CLASSES) parsing_im = Image.fromarray(msk[0]) parsing_im.save('{}/{}_vis.png'.format(parsing_dir, img_id)) coord.request_stop() coord.join(threads) if __name__ == '__main__': humanParsing1() humanParsing2() ``` 最终输出结果一直是第一个列表里面的循环,代码上用了 self.queue = tf.train.slice_input_producer([self.images, self.labels, self.edges], shuffle=shuffle),队列的方式进行多线程推理。最终得到的结果一直是第一个列表的循环,求大神告诉问题怎么解决。

tensorflow训练过程权重不更新,loss不下降,输出保持不变,只有bias在非常缓慢地变化?

模型里没有参数被初始化为0 ,学习率从10的-5次方试到了0.1,输入数据都已经被归一化为了0-1之间,模型是改过的vgg16,有四个输出,使用了vgg16的预训练模型来初始化参数,输出中间结果也没有nan或者inf值。是不是不能自定义损失函数呢?但输出中间梯度发现并不是0,非常奇怪。 **train.py的部分代码** ``` def train(): x = tf.placeholder(tf.float32, [None, 182, 182, 2], name = 'image_input') y_ = tf.placeholder(tf.float32, [None, 8], name='label_input') global_step = tf.Variable(0, trainable=False) learning_rate = tf.train.exponential_decay(learning_rate=0.0001,decay_rate=0.9, global_step=TRAINING_STEPS, decay_steps=50,staircase=True) # 读取图片数据,pos是标签为1的图,neg是标签为0的图 pos, neg = get_data.get_image(img_path) #输入标签固定,输入数据每个batch前4张放pos,后4张放neg label_batch = np.reshape(np.array([1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]),[1, 8]) vgg = vgg16.Vgg16() vgg.build(x) #loss函数的定义在后面 loss = vgg.side_loss( y_,vgg.output1, vgg.output2, vgg.output3, vgg.output4) train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss, global_step=global_step) init_op = tf.global_variables_initializer() saver = tf.train.Saver() with tf.device('/gpu:0'): with tf.Session() as sess: sess.run(init_op) for i in range(TRAINING_STEPS): #在train.py的其他部分定义了batch_size= 4 start = i * batch_size end = start + batch_size #制作输入数据,前4个是标签为1的图,后4个是标签为0的图 image_list = [] image_list.append(pos[start:end]) image_list.append(neg[start:end]) image_batch = np.reshape(np.array(image_list),[-1,182,182,2]) _,loss_val,step = sess.run([train_step,loss,global_step], feed_dict={x: image_batch,y_:label_batch}) if i % 50 == 0: print("the step is %d,loss is %f" % (step, loss_val)) if loss_val < min_loss: min_loss = loss_val saver.save(sess, 'ckpt/vgg.ckpt', global_step=2000) ``` **Loss 函数的定义** ``` **loss函数的定义(写在了Vgg16类里)** ``` class Vgg16: #a,b,c,d都是vgg模型里的输出,是多输出模型 def side_loss(self,yi,a,b,c,d): self.loss1 = self.f_indicator(yi, a) self.loss2 = self.f_indicator(yi, b) self.loss3 = self.f_indicator(yi, c) self.loss_fuse = self.f_indicator(yi, d) self.loss_side = self.loss1 + self.loss2 + self.loss3 + self.loss_fuse res_loss = tf.reduce_sum(self.loss_side) return res_loss #损失函数的定义,标签为0时为log(1-yj),标签为1时为log(yj) def f_indicator(self,yi,yj): b = tf.where(yj>=1,yj*50,tf.abs(tf.log(tf.abs(1 - yj)))) res=tf.where(tf.equal(yi , 0.0), b,tf.abs(tf.log(tf.clip_by_value(yj, 1e-8, float("inf"))))) return res ```

在训练Tensorflow模型(object_detection)时,训练在第一次评估后退出,怎么使训练继续下去?

当我进行ssd模型训练时,训练进行了10分钟,然后进入评估阶段,评估之后程序就自动退出了,没有看到误和警告,这是为什么,怎么让程序一直训练下去? 训练命令: ``` python object_detection/model_main.py --pipeline_config_path=D:/gitcode/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/pipeline.config --model_dir=D:/gitcode/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/saved_model --num_train_steps=50000 --alsologtostderr ``` 配置文件: ``` training exit after the first evaluation(only one evaluation) in Tensorflow model(object_detection) without error and waring System information What is the top-level directory of the model you are using:models/research/object_detection/ Have I written custom code (as opposed to using a stock example script provided in TensorFlow):NO OS Platform and Distribution (e.g., Linux Ubuntu 16.04):Windows-10(64bit) TensorFlow installed from (source or binary):conda install tensorflow-gpu TensorFlow version (use command below):1.13.1 Bazel version (if compiling from source):N/A CUDA/cuDNN version:cudnn-7.6.0 GPU model and memory:GeForce GTX 1060 6GB Exact command to reproduce:See below my command for training : python object_detection/model_main.py --pipeline_config_path=D:/gitcode/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/pipeline.config --model_dir=D:/gitcode/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/saved_model --num_train_steps=50000 --alsologtostderr This is my config : train_config { batch_size: 24 data_augmentation_options { random_horizontal_flip { } } data_augmentation_options { ssd_random_crop { } } optimizer { rms_prop_optimizer { learning_rate { exponential_decay_learning_rate { initial_learning_rate: 0.00400000018999 decay_steps: 800720 decay_factor: 0.949999988079 } } momentum_optimizer_value: 0.899999976158 decay: 0.899999976158 epsilon: 1.0 } } fine_tune_checkpoint: "D:/gitcode/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/model.ckpt" from_detection_checkpoint: true num_steps: 200000 train_input_reader { label_map_path: "D:/gitcode/models/research/object_detection/idol/tf_label_map.pbtxt" tf_record_input_reader { input_path: "D:/gitcode/models/research/object_detection/idol/train/Iframe_??????.tfrecord" } } eval_config { num_examples: 8000 max_evals: 10 use_moving_averages: false } eval_input_reader { label_map_path: "D:/gitcode/models/research/object_detection/idol/tf_label_map.pbtxt" shuffle: false num_readers: 1 tf_record_input_reader { input_path: "D:/gitcode/models/research/object_detection/idol/eval/Iframe_??????.tfrecord" } ``` 窗口输出: (default) D:\gitcode\models\research>python object_detection/model_main.py --pipeline_config_path=D:/gitcode/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/pipeline.config --model_dir=D:/gitcode/models/research/object_detection/ssd_mobilenet_v1_coco_2018_01_28/saved_model --num_train_steps=50000 --alsologtostderr WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see: https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md https://github.com/tensorflow/addons If you depend on functionality not listed there, please file an issue. WARNING:tensorflow:Forced number of epochs for all eval validations to be 1. WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered eval_on_train_input_config.num_epochs = 0. Overwriting num_epochs to 1. WARNING:tensorflow:Estimator's model_fn (<function create_model_fn..model_fn at 0x0000027CBAB7BB70>) includes params argument, but params are not passed to Estimator. WARNING:tensorflow:From C:\Users\qian\Anaconda3\envs\default\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version. Instructions for updating: Colocations handled automatically by placer. WARNING:tensorflow:From C:\Users\qian\Anaconda3\envs\default\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\builders\dataset_builder.py:86: parallel_interleave (from tensorflow.contrib.data.python.ops.interleave_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.data.experimental.parallel_interleave(...). WARNING:tensorflow:From C:\Users\qian\Anaconda3\envs\default\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\core\preprocessor.py:196: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version. Instructions for updating: seed2 arg is deprecated.Use sample_distorted_bounding_box_v2 instead. WARNING:tensorflow:From C:\Users\qian\Anaconda3\envs\default\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\builders\dataset_builder.py:158: batch_and_drop_remainder (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version. Instructions for updating: Use tf.data.Dataset.batch(..., drop_remainder=True). WARNING:tensorflow:From C:\Users\qian\Anaconda3\envs\default\lib\site-packages\tensorflow\python\ops\losses\losses_impl.py:448: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. WARNING:tensorflow:From C:\Users\qian\Anaconda3\envs\default\lib\site-packages\tensorflow\python\ops\array_grad.py:425: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. 2019-08-14 16:29:31.607841: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties: name: GeForce GTX 1060 6GB major: 6 minor: 1 memoryClockRate(GHz): 1.7845 pciBusID: 0000:04:00.0 totalMemory: 6.00GiB freeMemory: 4.97GiB 2019-08-14 16:29:31.621836: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0 2019-08-14 16:29:32.275712: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix: 2019-08-14 16:29:32.283072: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 2019-08-14 16:29:32.288675: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N 2019-08-14 16:29:32.293514: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4714 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:04:00.0, compute capability: 6.1) WARNING:tensorflow:From C:\Users\qian\Anaconda3\envs\default\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\eval_util.py:796: to_int64 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version. Instructions for updating: Use tf.cast instead. WARNING:tensorflow:From C:\Users\qian\Anaconda3\envs\default\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\utils\visualization_utils.py:498: py_func (from tensorflow.python.ops.script_ops) is deprecated and will be removed in a future version. Instructions for updating: tf.py_func is deprecated in TF V2. Instead, use tf.py_function, which takes a python function which manipulates tf eager tensors instead of numpy arrays. It's easy to convert a tf eager tensor to an ndarray (just call tensor.numpy()) but having access to eager tensors means tf.py_functions can use accelerators such as GPUs as well as being differentiable using a gradient tape. 2019-08-14 16:41:44.736212: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0 2019-08-14 16:41:44.741242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix: 2019-08-14 16:41:44.747522: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0 2019-08-14 16:41:44.751256: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N 2019-08-14 16:41:44.755548: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4714 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1060 6GB, pci bus id: 0000:04:00.0, compute capability: 6.1) WARNING:tensorflow:From C:\Users\qian\Anaconda3\envs\default\lib\site-packages\tensorflow\python\training\saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version. Instructions for updating: Use standard file APIs to check for files with this prefix. creating index... index created! creating index... index created! Running per image evaluation... Evaluate annotation type bbox DONE (t=2.43s). Accumulating evaluation results... DONE (t=0.14s). Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.287 Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.529 Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.278 Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000 Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.031 Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.312 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.162 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.356 Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.356 Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000 Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.061 Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.384 (default) D:\gitcode\models\research>

用tensorflow-gpu跑SSD-Mobilenet模型隔一段时间就会出现以下内容

![图片说明](https://img-ask.csdn.net/upload/201903/16/1552739746_369680.png) 我用的以下命令python object____detection/train.py --train_dir object_detection/train --pipeline_config_path object__detection/ssd__model/ssd_mobilenet_v1_pets.config___ 然后在object_detection 目录下没有见到train文件夹 这正常吗,我之前用CPU跑的时候很快就创建了train文件夹

Tensorflow测试训练styleGAN时报错 No OpKernel was registered to support Op 'NcclAllReduce' with these attrs.

在测试官方StyleGAN。 运行官方与训练模型pretrained_example.py generate_figures.py 没有问题。GPU工作正常。 运行train.py时报错 尝试只用单个GPU训练时没有报错。 NcclAllReduce应该跟多GPU通信有关,不太了解。 InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'NcclAllReduce' with these attrs. Registered devices: [CPU,GPU], Registered kernels: <no registered kernels> [[Node: TrainD/SumAcrossGPUs/NcclAllReduce = NcclAllReduce[T=DT_FLOAT, num_devices=2, reduction="sum", shared_name="c112", _device="/device:GPU:0"](GPU0/TrainD_grad/gradients/AddN_160)]] 经过多番google 尝试过 重启 conda install keras-gpu 重新安装tensorflow-gpu==1.10.0(跟官方版本保持一致) ``` …… Building TensorFlow graph... Setting up snapshot image grid... Setting up run dir... Training... Traceback (most recent call last): File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 1278, in _do_call return fn(*args) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 1263, in _run_fn options, feed_dict, fetch_list, target_list, run_metadata) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'NcclAllReduce' with these attrs. Registered devices: [CPU,GPU], Registered kernels: <no registered kernels> [[Node: TrainD/SumAcrossGPUs/NcclAllReduce = NcclAllReduce[T=DT_FLOAT, num_devices=2, reduction="sum", shared_name="c112", _device="/device:GPU:0"](GPU0/TrainD_grad/gradients/AddN_160)]] During handling of the above exception, another exception occurred: Traceback (most recent call last): File "train.py", line 191, in <module> main() File "train.py", line 186, in main dnnlib.submit_run(**kwargs) File "E:\MachineLearning\stylegan-master\dnnlib\submission\submit.py", line 290, in submit_run run_wrapper(submit_config) File "E:\MachineLearning\stylegan-master\dnnlib\submission\submit.py", line 242, in run_wrapper util.call_func_by_name(func_name=submit_config.run_func_name, submit_config=submit_config, **submit_config.run_func_kwargs) File "E:\MachineLearning\stylegan-master\dnnlib\util.py", line 257, in call_func_by_name return func_obj(*args, **kwargs) File "E:\MachineLearning\stylegan-master\training\training_loop.py", line 230, in training_loop tflib.run([D_train_op, Gs_update_op], {lod_in: sched.lod, lrate_in: sched.D_lrate, minibatch_in: sched.minibatch}) File "E:\MachineLearning\stylegan-master\dnnlib\tflib\tfutil.py", line 26, in run return tf.get_default_session().run(*args, **kwargs) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 877, in run run_metadata_ptr) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 1100, in _run feed_dict_tensor, options, run_metadata) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 1272, in _do_run run_metadata) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\client\session.py", line 1291, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'NcclAllReduce' with these attrs. Registered devices: [CPU,GPU], Registered kernels: <no registered kernels> [[Node: TrainD/SumAcrossGPUs/NcclAllReduce = NcclAllReduce[T=DT_FLOAT, num_devices=2, reduction="sum", shared_name="c112", _device="/device:GPU:0"](GPU0/TrainD_grad/gradients/AddN_160)]] Caused by op 'TrainD/SumAcrossGPUs/NcclAllReduce', defined at: File "train.py", line 191, in <module> main() File "train.py", line 186, in main dnnlib.submit_run(**kwargs) File "E:\MachineLearning\stylegan-master\dnnlib\submission\submit.py", line 290, in submit_run run_wrapper(submit_config) File "E:\MachineLearning\stylegan-master\dnnlib\submission\submit.py", line 242, in run_wrapper util.call_func_by_name(func_name=submit_config.run_func_name, submit_config=submit_config, **submit_config.run_func_kwargs) File "E:\MachineLearning\stylegan-master\dnnlib\util.py", line 257, in call_func_by_name return func_obj(*args, **kwargs) File "E:\MachineLearning\stylegan-master\training\training_loop.py", line 185, in training_loop D_train_op = D_opt.apply_updates() File "E:\MachineLearning\stylegan-master\dnnlib\tflib\optimizer.py", line 135, in apply_updates g = nccl_ops.all_sum(g) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\contrib\nccl\python\ops\nccl_ops.py", line 49, in all_sum return _apply_all_reduce('sum', tensors) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\contrib\nccl\python\ops\nccl_ops.py", line 230, in _apply_all_reduce shared_name=shared_name)) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\contrib\nccl\ops\gen_nccl_ops.py", line 59, in nccl_all_reduce num_devices=num_devices, shared_name=shared_name, name=name) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper op_def=op_def) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\util\deprecation.py", line 454, in new_func return func(*args, **kwargs) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\framework\ops.py", line 3156, in create_op op_def=op_def) File "d:\Users\admin\Anaconda3\envs\tfenv\lib\site-packages\tensorflow\python\framework\ops.py", line 1718, in __init__ self._traceback = tf_stack.extract_stack() InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'NcclAllReduce' with these attrs. Registered devices: [CPU,GPU], Registered kernels: <no registered kernels> [[Node: TrainD/SumAcrossGPUs/NcclAllReduce = NcclAllReduce[T=DT_FLOAT, num_devices=2, reduction="sum", shared_name="c112", _device="/device:GPU:0"](GPU0/TrainD_grad/gradients/AddN_160)]] ``` ``` #conda list: # Name Version Build Channel _tflow_select 2.1.0 gpu absl-py 0.8.1 pypi_0 pypi alabaster 0.7.12 py36_0 asn1crypto 1.2.0 py36_0 astor 0.8.0 pypi_0 pypi astroid 2.3.2 py36_0 attrs 19.3.0 py_0 babel 2.7.0 py_0 backcall 0.1.0 py36_0 blas 1.0 mkl bleach 3.1.0 py36_0 ca-certificates 2019.10.16 0 certifi 2019.9.11 py36_0 cffi 1.13.1 py36h7a1dbc1_0 chardet 3.0.4 py36_1003 cloudpickle 1.2.2 py_0 colorama 0.4.1 py36_0 cryptography 2.8 py36h7a1dbc1_0 cudatoolkit 9.0 1 cudnn 7.6.4 cuda9.0_0 decorator 4.4.1 py_0 defusedxml 0.6.0 py_0 django 2.2.7 pypi_0 pypi docutils 0.15.2 py36_0 entrypoints 0.3 py36_0 gast 0.3.2 py_0 grpcio 1.25.0 pypi_0 pypi h5py 2.9.0 py36h5e291fa_0 hdf5 1.10.4 h7ebc959_0 icc_rt 2019.0.0 h0cc432a_1 icu 58.2 ha66f8fd_1 idna 2.8 pypi_0 pypi image 1.5.27 pypi_0 pypi imagesize 1.1.0 py36_0 importlib_metadata 0.23 py36_0 intel-openmp 2019.4 245 ipykernel 5.1.3 py36h39e3cac_0 ipython 7.9.0 py36h39e3cac_0 ipython_genutils 0.2.0 py36h3c5d0ee_0 isort 4.3.21 py36_0 jedi 0.15.1 py36_0 jinja2 2.10.3 py_0 jpeg 9b hb83a4c4_2 jsonschema 3.1.1 py36_0 jupyter_client 5.3.4 py36_0 jupyter_core 4.6.1 py36_0 keras-applications 1.0.8 py_0 keras-base 2.2.4 py36_0 keras-gpu 2.2.4 0 keras-preprocessing 1.1.0 py_1 keyring 18.0.0 py36_0 lazy-object-proxy 1.4.3 py36he774522_0 libpng 1.6.37 h2a8f88b_0 libprotobuf 3.9.2 h7bd577a_0 libsodium 1.0.16 h9d3ae62_0 markdown 3.1.1 py36_0 markupsafe 1.1.1 py36he774522_0 mccabe 0.6.1 py36_1 mistune 0.8.4 py36he774522_0 mkl 2019.4 245 mkl-service 2.3.0 py36hb782905_0 mkl_fft 1.0.15 py36h14836fe_0 mkl_random 1.1.0 py36h675688f_0 more-itertools 7.2.0 py36_0 nbconvert 5.6.1 py36_0 nbformat 4.4.0 py36h3a5bc1b_0 numpy 1.17.3 py36h4ceb530_0 numpy-base 1.17.3 py36hc3f5095_0 numpydoc 0.9.1 py_0 openssl 1.1.1d he774522_3 packaging 19.2 py_0 pandoc 2.2.3.2 0 pandocfilters 1.4.2 py36_1 parso 0.5.1 py_0 pickleshare 0.7.5 py36_0 pillow 6.2.1 pypi_0 pypi pip 19.3.1 py36_0 prompt_toolkit 2.0.10 py_0 protobuf 3.10.0 pypi_0 pypi psutil 5.6.3 py36he774522_0 pycodestyle 2.5.0 py36_0 pycparser 2.19 py36_0 pyflakes 2.1.1 py36_0 pygments 2.4.2 py_0 pylint 2.4.3 py36_0 pyopenssl 19.0.0 py36_0 pyparsing 2.4.2 py_0 pyqt 5.9.2 py36h6538335_2 pyreadline 2.1 py36_1 pyrsistent 0.15.4 py36he774522_0 pysocks 1.7.1 py36_0 python 3.6.9 h5500b2f_0 python-dateutil 2.8.1 py_0 pytz 2019.3 py_0 pywin32 223 py36hfa6e2cd_1 pyyaml 5.1.2 py36he774522_0 pyzmq 18.1.0 py36ha925a31_0 qt 5.9.7 vc14h73c81de_0 qtawesome 0.6.0 py_0 qtconsole 4.5.5 py_0 qtpy 1.9.0 py_0 requests 2.22.0 py36_0 rope 0.14.0 py_0 scipy 1.3.1 py36h29ff71c_0 setuptools 39.1.0 pypi_0 pypi sip 4.19.8 py36h6538335_0 six 1.13.0 pypi_0 pypi snowballstemmer 2.0.0 py_0 sphinx 2.2.1 py_0 sphinxcontrib-applehelp 1.0.1 py_0 sphinxcontrib-devhelp 1.0.1 py_0 sphinxcontrib-htmlhelp 1.0.2 py_0 sphinxcontrib-jsmath 1.0.1 py_0 sphinxcontrib-qthelp 1.0.2 py_0 sphinxcontrib-serializinghtml 1.1.3 py_0 spyder 3.3.6 py36_0 spyder-kernels 0.5.2 py36_0 sqlite 3.30.1 he774522_0 sqlparse 0.3.0 pypi_0 pypi tensorboard 1.10.0 py36he025d50_0 tensorflow 1.10.0 gpu_py36h3514669_0 tensorflow-base 1.10.0 gpu_py36h6e53903_0 tensorflow-gpu 1.10.0 pypi_0 pypi termcolor 1.1.0 pypi_0 pypi testpath 0.4.2 py36_0 tornado 6.0.3 py36he774522_0 traitlets 4.3.3 py36_0 typed-ast 1.4.0 py36he774522_0 urllib3 1.25.6 pypi_0 pypi vc 14.1 h0510ff6_4 vs2015_runtime 14.16.27012 hf0eaf9b_0 wcwidth 0.1.7 py36h3d5aa90_0 webencodings 0.5.1 py36_1 werkzeug 0.16.0 py_0 wheel 0.33.6 py36_0 win_inet_pton 1.1.0 py36_0 wincertstore 0.2 py36h7fe50ca_0 wrapt 1.11.2 py36he774522_0 yaml 0.1.7 hc54c509_2 zeromq 4.3.1 h33f27b4_3 zipp 0.6.0 py_0 zlib 1.2.11 h62dcd97_3 ``` 2*RTX2080Ti driver 4.19.67

训练SSD_mobilenet_v1_coco模型时tensorflow出现以下报错,请问原因是什么 怎么解决?

tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0] = 0 is not in [0, 0) [[{{node GatherV2_4}}]] [[node IteratorGetNext (defined at C:\Users\lenovo\Anaconda3\lib\site-packages\tensorflow_estimator\python\estimator\util.py:110) ]] ``` ```

有没有人试过在自己工程里调用tensorflow C++ API

最近想在自己的工程里调用tensorflow c++API,查了网上的博客,都是一些demo就是单独用来做预测的makefile 然而我的工程里本身就有makefile,那这样的话如何让两个makefile结合到一起

使用Keras找不到tensorflow

程序代码 #-*- coding: utf-8 -*- #使用神经网络算法预测销量高低 import pandas as pd #参数初始化 inputfile = 'D:/python/chapter5/demo/data/sales_data.xls' data = pd.read_excel(inputfile, index_col = u'序号') #导入数据 #数据是类别标签,要将它转换为数据 #用1来表示“好”、“是”、“高”这三个属性,用0来表示“坏”、“否”、“低” data[data == u'好'] = 1 data[data == u'是'] = 1 data[data == u'高'] = 1 data[data != 1] = 0 x = data.iloc[:,:3].as_matrix().astype(int) y = data.iloc[:,3].as_matrix().astype(int) from keras.models import Sequential from keras.layers.core import Dense, Activation model = Sequential() #建立模型 model.add(Dense(input_dim = 3, output_dim = 10)) model.add(Activation('relu')) #用relu函数作为激活函数,能够大幅提供准确度 model.add(Dense(input_dim = 10, output_dim = 1)) model.add(Activation('sigmoid')) #由于是0-1输出,用sigmoid函数作为激活函数 model.compile(loss = 'binary_crossentropy', optimizer = 'adam', class_mode = 'binary') #编译模型。由于我们做的是二元分类,所以我们指定损失函数为binary_crossentropy,以及模式为binary #另外常见的损失函数还有mean_squared_error、categorical_crossentropy等,请阅读帮助文件。 #求解方法我们指定用adam,还有sgd、rmsprop等可选 model.fit(x, y, nb_epoch = 1000, batch_size = 10) #训练模型,学习一千次 yp = model.predict_classes(x).reshape(len(y)) #分类预测 from cm_plot import * #导入自行编写的混淆矩阵可视化函数 cm_plot(y,yp).show() #显示混淆矩阵可视化结果 错误提示 Using TensorFlow backend. Traceback (most recent call last): File "D:\python\chapter5\demo\code\5-3_neural_network.py", line 19, in <module> from keras.models import Sequential File "C:\Python27\lib\site-packages\keras\__init__.py", line 3, in <module> from . import utils File "C:\Python27\lib\site-packages\keras\utils\__init__.py", line 6, in <module> from . import conv_utils File "C:\Python27\lib\site-packages\keras\utils\conv_utils.py", line 3, in <module> from .. import backend as K File "C:\Python27\lib\site-packages\keras\backend\__init__.py", line 83, in <module> from .tensorflow_backend import * File "C:\Python27\lib\site-packages\keras\backend\tensorflow_backend.py", line 1, in <module> import tensorflow as tf ImportError: No module named tensorflow

保存使用keras训练的TF模型,然后在Go中进行评估

<div class="post-text" itemprop="text"> <p>I'm trying to setup a classical MNIST challenge model with <code>keras</code>, then save the <code>tensorflow</code> graph and subsequently load it in <code>Go</code> and evaluate with some input. I've been following <a href="https://nilsmagnus.github.io/post/go-tensorflow/" rel="nofollow noreferrer">this article</a> which supplies full code on <a href="https://github.com/nilsmagnus/tensorflow-with-go" rel="nofollow noreferrer">github</a>. Nils is using just tensorflow to setup the comp.graph but I would like to use <code>keras</code>. I managd to save the model the same way as he does </p> <p>model:</p> <pre><code> model = Sequential() model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28,28,1), name="inputNode")) model.add(Conv2D(64, (3, 3), activation='relu')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.25)) model.add(Flatten()) model.add(Dense(128, activation='relu')) model.add(Dropout(0.5)) model.add(Dense(10, activation='softmax', name="inferNode")) </code></pre> <p>which runs ok, trains and evaluates and then saving as posted above:</p> <pre><code>builder = tf.saved_model.builder.SavedModelBuilder("mnistmodel_my") # GOLANG note that we must tag our model so that we can retrieve it at inference-time builder.add_meta_graph_and_variables(sess, ["serve"]) builder.save() </code></pre> <p>Which I then try to evaluate as :</p> <pre><code>result, runErr := model.Session.Run( map[tf.Output]*tf.Tensor{ model.Graph.Operation("inputNode").Output(0): tensor, }, []tf.Output{ model.Graph.Operation("inferNode").Output(0), }, nil, ) </code></pre> <p>In Go I follow the example but when evaluating, I get:</p> <pre><code> panic: nil-Operation. If the Output was created with a Scope object, see Scope.Err() for details. goroutine 1 [running]: github.com/tensorflow/tensorflow/tensorflow/go.Output.c(0x0, 0x0, 0x0, 0x0) /Users/air/go/src/github.com/tensorflow/tensorflow/tensorflow/go/operation.go:119 +0xbb github.com/tensorflow/tensorflow/tensorflow/go.newCRunArgs(0xc42006e210, 0xc420047ef0, 0x1, 0x1, 0x0, 0x0, 0x0, 0xc4200723c8) /Users/air/go/src/github.com/tensorflow/tensorflow/tensorflow/go/session.go:307 +0x22d github.com/tensorflow/tensorflow/tensorflow/go.(*Session).Run(0xc420078060, 0xc42006e210, 0xc420047ef0, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0, ...) /Users/air/go/src/github.com/tensorflow/tensorflow/tensorflow/go/session.go:85 +0x153 main.main() /Users/air/PycharmProjects/GoTensor/custom.go:36 +0x341 exit status 2 </code></pre> <p>Since it says <code>nil-Operation</code> I think I might have incorrectly labelled the nodes. But I don't know which other nodes should I then label?</p> <p>Many thanks!!!</p> </div>

Bert训练模型时报错:Windows fatal exception: access violation

运行之后: ``` 2020-04-28 23:26:44.729208: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll Windows fatal exception: access violation Current thread 0x00004a90 (most recent call first): File "C:\Users\DELL\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow_core\python\lib\io\file_io.py", line 84 in _preread_check File "C:\Users\DELL\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow_core\python\lib\io\file_io.py", line 122 in read File "D:\senti\code\Bert\modeling.py", line 94 in from_json_file File "D:/senti/code/Bert/run_classifier.py", line 844 in main File "C:\Users\DELL\Anaconda3\envs\tensorflow-gpu\lib\site-packages\absl\app.py", line 250 in _run_main File "C:\Users\DELL\Anaconda3\envs\tensorflow-gpu\lib\site-packages\absl\app.py", line 299 in run File "C:\Users\DELL\Anaconda3\envs\tensorflow-gpu\lib\site-packages\tensorflow_core\python\platform\app.py", line 40 in run File "D:/senti/code/Bert/run_classifier.py", line 1025 in <module> Process finished with exit code -1073741819 (0xC0000005) ```

tensorflow,python运行时报错在reshape上,求大神解答

代码来自于一篇博客,用tensorflow判断拨号图标和短信图标的分类,训练已经成功运行,以下为测试代码,错误出现在38行 image = tf.reshape(image, [1, 208, 208, 3]) 我的测试图片是256*256的,也测试了48*48的 ``` #!/usr/bin/python # -*- coding:utf-8 -*- # @Time : 2018/3/31 0031 17:50 # @Author : scw # @File : main.py # 进行图片预测方法调用的文件 import numpy as np from PIL import Image import tensorflow as tf import matplotlib.pyplot as plt from shenjingwangluomodel import inference # 定义需要进行分类的种类,这里我是进行分两种,因为一种为是拨号键,另一种就是非拨号键 CallPhoneStyle = 2 # 进行测试的操作处理========================== # 加载要进行测试的图片 def get_one_image(img_dir): image = Image.open(img_dir) # Image.open() # 好像一次只能打开一张图片,不能一次打开一个文件夹,这里大家可以去搜索一下 plt.imshow(image) image = image.resize([208, 208]) image_arr = np.array(image) return image_arr # 进行测试处理------------------------------------------------------- def test(test_file): # 设置加载训练结果的文件目录(这个是需要之前就已经训练好的,别忘记) log_dir = '/home/administrator/test_system/calldata2/' # 打开要进行测试的图片 image_arr = get_one_image(test_file) with tf.Graph().as_default(): # 把要进行测试的图片转为tensorflow所支持的格式 image = tf.cast(image_arr, tf.float32) # 将图片进行格式化的处理 image = tf.image.per_image_standardization(image) # 将tensorflow的图片的格式参数,转变为shape格式的,好像就是去掉-1这样的列表 image = tf.reshape(image, [1, 208, 208, 3]) # print(image.shape) # 参数CallPhoneStyle:表示的是分为两类 p = inference(image, 1, CallPhoneStyle) # 这是训练出一个神经网络的模型 # 这里用到了softmax这个逻辑回归模型的处理 logits = tf.nn.softmax(p) x = tf.placeholder(tf.float32, shape=[208, 208, 3]) saver = tf.train.Saver() with tf.Session() as sess: # 对tensorflow的训练参数进行初始化,使用默认的方式 sess.run(tf.global_variables_initializer()) # 判断是否有进行训练模型的设置,所以一定要之前就进行了模型的训练 ckpt = tf.train.get_checkpoint_state(log_dir) if ckpt and ckpt.model_checkpoint_path: # global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1] saver.restore(sess, ckpt.model_checkpoint_path) # 调用saver.restore()函数,加载训练好的网络模型 print('Loading success') else: print('No checkpoint') prediction = sess.run(logits, feed_dict={x: image_arr}) max_index = np.argmax(prediction) print('预测的标签为:') if max_index == 0: print("是拨号键图片") else: print("是短信图片") # print(max_index) print('预测的分类结果每种的概率为:') print(prediction) # 我用0,1表示两种分类,这也是我在训练的时候就设置好的 if max_index == 0: print('图片是拨号键图标的概率为 %.6f' %prediction[:, 0]) elif max_index == 1: print('图片是短信它图标的概率为 %.6f' %prediction[:, 1]) # 进行图片预测 test('/home/administrator/Downloads/def.png') ''' # 测试自己的训练集的图片是不是已经加载成功(因为这个是进行训练的第一步) train_dir = 'E:/tensorflowdata/calldata/' BATCH_SIZE = 5 # 生成批次队列中的容量缓存的大小 CAPACITY = 256 # 设置我自己要对图片进行统一大小的高和宽 IMG_W = 208 IMG_H = 208 image_list,label_list = get_files(train_dir) # 加载训练集的图片和对应的标签 image_batch,label_batch = get_batch(image_list,label_list,IMG_W,IMG_H,BATCH_SIZE,CAPACITY) # 进行批次图片加载到内存中 # 这是打开一个session,主要是用于进行图片的显示效果的测试 with tf.Session() as sess: i = 0 coord = tf.train.Coordinator() threads = tf.train.start_queue_runners(coord=coord) try: while not coord.should_stop() and i < 2: # 提取出两个batch的图片并可视化。 img, label = sess.run([image_batch, label_batch]) for j in np.arange(BATCH_SIZE): print('label: %d' % label[j]) plt.imshow(img[j, :, :, :]) plt.show() i += 1 except tf.errors.OutOfRangeError: print('done!') finally: coord.request_stop() coord.join(threads) ''' ```

为什么同样的问题用Tensorflow和keras实现结果不一样?

**cifar-10分类问题,同样的模型结构以及损失函数还有学习率参数等超参数,分别用TensorFlow和keras实现。 20个epochs后在测试集上进行预测,准确率总是差好几个百分点,不知道问题出在哪里?代码如下: 这个是TF的代码:** import tensorflow as tf import numpy as np import pickle as pk tf.reset_default_graph() batch_size = 64 test_size = 10000 img_size = 32 num_classes = 10 training_epochs = 10 test_size=200 ############################################################################### def unpickle(filename): '''解压数据''' with open(filename, 'rb') as f: d = pk.load(f, encoding='latin1') return d def onehot(labels): '''one-hot 编码''' n_sample = len(labels) n_class = max(labels) + 1 onehot_labels = np.zeros((n_sample, n_class)) onehot_labels[np.arange(n_sample), labels] = 1 return onehot_labels # 训练数据集 data1 = unpickle('data_batch_1') data2 = unpickle('data_batch_2') data3 = unpickle('data_batch_3') data4 = unpickle('data_batch_4') data5 = unpickle('data_batch_5') X_train = np.concatenate((data1['data'], data2['data'], data3['data'], data4['data'], data5['data']), axis=0)/255.0 y_train = np.concatenate((data1['labels'], data2['labels'], data3['labels'], data4['labels'], data5['labels']), axis=0) y_train = onehot(y_train) # 测试数据集 test = unpickle('test_batch') X_test = test['data']/255.0 y_test = onehot(test['labels']) del test,data1,data2,data3,data4,data5 ############################################################################### w = tf.Variable(tf.random_normal([5, 5, 3, 32], stddev=0.01)) w_c= tf.Variable(tf.random_normal([32* 16* 16, 512], stddev=0.1)) w_o =tf.Variable(tf.random_normal([512, num_classes], stddev=0.1)) def init_bias(shape): return tf.Variable(tf.constant(0.0, shape=shape)) b=init_bias([32]) b_c=init_bias([512]) b_o=init_bias([10]) def model(X, w, w_c,w_o, p_keep_conv, p_keep_hidden,b,b_c,b_o): conv1 = tf.nn.conv2d(X, w,strides=[1, 1, 1, 1],padding='SAME')#32x32x32 conv1=tf.nn.bias_add(conv1,b) conv1 = tf.nn.relu(conv1) conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1],padding='SAME')#16x16x32 conv1 = tf.nn.dropout(conv1, p_keep_conv) FC_layer = tf.reshape(conv1, [-1, 32 * 16 * 16]) out_layer=tf.matmul(FC_layer, w_c)+b_c out_layer=tf.nn.relu(out_layer) out_layer = tf.nn.dropout(out_layer, p_keep_hidden) result = tf.matmul(out_layer, w_o)+b_o return result trX, trY, teX, teY = X_train,y_train,X_test,y_test trX = trX.reshape(-1, img_size, img_size, 3) teX = teX.reshape(-1, img_size, img_size, 3) X = tf.placeholder("float", [None, img_size, img_size, 3]) Y = tf.placeholder("float", [None, num_classes]) p_keep_conv = tf.placeholder("float") p_keep_hidden = tf.placeholder("float") py_x = model(X, w, w_c,w_o, p_keep_conv, p_keep_hidden,b,b_c,b_o) Y_ = tf.nn.softmax_cross_entropy_with_logits_v2(logits=py_x, labels=Y) cost = tf.reduce_mean(Y_) optimizer = tf.train.RMSPropOptimizer(0.001, 0.9).minimize(cost) predict_op = tf.argmax(py_x, 1) with tf.Session() as sess: tf.global_variables_initializer().run() for i in range(training_epochs): training_batch = zip(range(0, len(trX),batch_size),range(batch_size, len(trX)+1,batch_size)) perm=np.arange(len(trX)) np.random.shuffle(perm) trX=trX[perm] trY=trY[perm] for start, end in training_batch: sess.run(optimizer, feed_dict={X: trX[start:end],Y: trY[start:end],p_keep_conv:0.75,p_keep_hidden: 0.5}) test_batch = zip(range(0, len(teX),test_size),range(test_size, len(teX)+1,test_size)) accuracyResult=0 for start, end in test_batch: accuracyResult=accuracyResult+sum(np.argmax(teY[start:end], axis=1) ==sess.run(predict_op, feed_dict={X: teX[start:end],Y: teY[start:end],p_keep_conv: 1,p_keep_hidden: 1})) print(i, accuracyResult/10000) **这个是keras代码:** from keras import initializers from keras.datasets import cifar10 from keras.utils import np_utils from keras.models import Sequential from keras.layers.core import Dense, Dropout, Activation, Flatten from keras.layers.convolutional import Conv2D, MaxPooling2D from keras.optimizers import SGD, Adam, RMSprop #import matplotlib.pyplot as plt # CIFAR_10 is a set of 60K images 32x32 pixels on 3 channels IMG_CHANNELS = 3 IMG_ROWS = 32 IMG_COLS = 32 #constant BATCH_SIZE = 64 NB_EPOCH = 10 NB_CLASSES = 10 VERBOSE = 1 VALIDATION_SPLIT = 0 OPTIM = RMSprop() #load dataset (X_train, y_train), (X_test, y_test) = cifar10.load_data() #print('X_train shape:', X_train.shape) #print(X_train.shape[0], 'train samples') #print(X_test.shape[0], 'test samples') # convert to categorical Y_train = np_utils.to_categorical(y_train, NB_CLASSES) Y_test = np_utils.to_categorical(y_test, NB_CLASSES) # float and normalization X_train = X_train.astype('float32') X_test = X_test.astype('float32') X_train /= 255 X_test /= 255 # network model = Sequential() model.add(Conv2D(32, (3, 3), padding='same',input_shape=(IMG_ROWS, IMG_COLS, IMG_CHANNELS),kernel_initializer=initializers.random_normal(stddev=0.01),bias_initializer=initializers.Zeros())) model.add(Activation('relu')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.25)) #0<参数<1才会有用 model.add(Flatten()) model.add(Dense(512,kernel_initializer=initializers.random_normal(stddev=0.1),bias_initializer=initializers.Zeros())) model.add(Activation('relu')) model.add(Dropout(0.5)) model.add(Dense(NB_CLASSES,kernel_initializer=initializers.random_normal(stddev=0.1),bias_initializer=initializers.Zeros())) model.add(Activation('softmax')) model.summary() # train model.compile(loss='categorical_crossentropy', optimizer=OPTIM,metrics=['accuracy']) model.fit(X_train, Y_train, batch_size=BATCH_SIZE,epochs=NB_EPOCH, validation_split=VALIDATION_SPLIT,verbose=VERBOSE) score = model.evaluate(X_test, Y_test,batch_size=200, verbose=VERBOSE) print("Test score:", score[0]) print('Test accuracy:', score[1])

tensorflow 在feed的时候报错could not convert string to float

报错ValueError: could not convert string to float: 'C:/Users/Administrator/Desktop/se//12.21/FV-USM/Published_database_FV-USM_Dec2013/1st_session/extractedvein/vein001_1/01.jpg' 以下是我的代码 ```#!/usr/bin/env python # -*- coding:utf-8 -*- import tensorflow as tf from PIL import Image import model import numpy as np import os def get_one_image(): i=1 n=1 k=1 i = str(i) n = str(n) k = str(k) a = i.zfill(3) b = n.zfill(1) c = k.zfill(2) files='C:/Users/Administrator/Desktop/se//12.21/FV-USM/Published_database_FV-USM_Dec2013/1st_session/extractedvein/vein' + a + '_' + b + '/' + c + '.jpg' # image=image.resize([60,175]) # image=np.array(image) return files def evaluate_one_image(): image_array = get_one_image() with tf.Graph().as_default(): BATCH_SIZE=1 N_CLASS=123 image = tf.image.decode_jpeg(image_array, channels=1) image = tf.image.resize_image_with_crop_or_pad(image,60, 175) image = tf.image.per_image_standardization(image) ###图片调整完成 # image=tf.cast(image,tf.string) # image1 = tf.image.decode_jpeg(image, channels=1) # image1 = tf.image.resize_image_with_crop_or_pad(image1, image_w, image_h) #image1 = tf.image.per_image_standardization(image) ###图片调整完成 #image=tf.image.per_image_standardization(image) image = tf.cast(image, tf.float32) image=tf.reshape(image,[1,60,175,1]) logit = model.inference(image, BATCH_SIZE, N_CLASS) logit=tf.nn.softmax(logit) # 因为 inference 的返回没有用激活函数,所以在这里对结果用softmax 激活。但是为什么要激活?? x=tf.placeholder(tf.float32,shape=[60,175,1]) logs_model_dir='C:/Users/Administrator/Desktop/python的神经网络/model/' saver=tf.train.Saver() with tf.Session() as sess: print("从指定路径加载模型。。。") ckpt=tf.train.get_checkpoint_state(logs_model_dir) if ckpt and ckpt.model_checkpoint_path: global_step=ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1] saver.restore(sess,ckpt.model_checkpoint_path) print('模型加载成功,训练步数为 %s'%global_step) else: print('模型加载失败,,,,文件没有找到') ###IMG = sess.run(image) prediction=sess.run(logit,feed_dict={x:image_array}) max_index=np.argmax(prediction) print(max_index) evaluate_one_image() ```

BERT模型训练报错:IndexError: list index out of range,求大佬指教!

![图片说明](https://img-ask.csdn.net/upload/202004/29/1588175660_746755.png) 运行结果: ``` C:\Users\DELL\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\framework\dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint8 = np.dtype([("qint8", np.int8, 1)]) C:\Users\DELL\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\framework\dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint8 = np.dtype([("quint8", np.uint8, 1)]) C:\Users\DELL\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\framework\dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint16 = np.dtype([("qint16", np.int16, 1)]) C:\Users\DELL\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\framework\dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint16 = np.dtype([("quint16", np.uint16, 1)]) C:\Users\DELL\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\framework\dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint32 = np.dtype([("qint32", np.int32, 1)]) C:\Users\DELL\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\framework\dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. np_resource = np.dtype([("resource", np.ubyte, 1)]) Traceback (most recent call last): File "D:/senti/code/Bert/run_classifier.py", line 1024, in <module> tf.app.run() File "C:\Users\DELL\Anaconda3\envs\tensorflow_gpu\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run _sys.exit(main(argv)) File "D:/senti/code/Bert/run_classifier.py", line 885, in main train_examples = processor.get_train_examples(FLAGS.data_dir) File "D:/senti/code/Bert/run_classifier.py", line 385, in get_train_examples self._read_tsv(os.path.join(data_dir, "train.tsv")), "train") File "D:/senti/code/Bert/run_classifier.py", line 408, in _create_examples text_a = tokenization.convert_to_unicode(line[1]) IndexError: list index out of range ```

基于tensorflow的pix2pix代码中如何做到输入图像和输出图像分辨率不一致

问题:例如在自己制作了成对的输入(input256×256 target 200×256)后,如何让输入图像和输出图像分辨率不一致,例如成对图像中:input的分辨率是256×256, output 和target都是200×256,需要修改哪里的参数。 论文参考:《Image-to-Image Translation with Conditional Adversarial Networks》 代码参考:https://blog.csdn.net/MOU_IT/article/details/80802407?utm_source=blogxgwz0 # coding=utf-8 from __future__ import absolute_import from __future__ import division from __future__ import print_function import os os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' import tensorflow as tf import numpy as np import os import glob import random import collections import math import time # https://github.com/affinelayer/pix2pix-tensorflow train_input_dir = "D:/Project/pix2pix-tensorflow-master/facades/train/" # 训练集输入 train_output_dir = "D:/Project/pix2pix-tensorflow-master/facades/train_out/" # 训练集输出 test_input_dir = "D:/Project/pix2pix-tensorflow-master/facades/val/" # 测试集输入 test_output_dir = "D:/Project/pix2pix-tensorflow-master/facades/test_out/" # 测试集的输出 checkpoint = "D:/Project/pix2pix-tensorflow-master/facades/train_out/" # 保存结果的目录 seed = None max_steps = None # number of training steps (0 to disable) max_epochs = 200 # number of training epochs progress_freq = 50 # display progress every progress_freq steps trace_freq = 0 # trace execution every trace_freq steps display_freq = 50 # write current training images every display_freq steps save_freq = 500 # save model every save_freq steps, 0 to disable separable_conv = False # use separable convolutions in the generator aspect_ratio = 1 #aspect ratio of output images (width/height) batch_size = 1 # help="number of images in batch") which_direction = "BtoA" # choices=["AtoB", "BtoA"]) ngf = 64 # help="number of generator filters in first conv layer") ndf = 64 # help="number of discriminator filters in first conv layer") scale_size = 286 # help="scale images to this size before cropping to 256x256") flip = True # flip images horizontally no_flip = True # don't flip images horizontally lr = 0.0002 # initial learning rate for adam beta1 = 0.5 # momentum term of adam l1_weight = 100.0 # weight on L1 term for generator gradient gan_weight = 1.0 # weight on GAN term for generator gradient output_filetype = "png" # 输出图像的格式 EPS = 1e-12 # 极小数,防止梯度为损失为0 CROP_SIZE = 256 # 图片的裁剪大小 # 命名元组,用于存放加载的数据集合创建好的模型 Examples = collections.namedtuple("Examples", "paths, inputs, targets, count, steps_per_epoch") Model = collections.namedtuple("Model", "outputs, predict_real, predict_fake, discrim_loss, discrim_grads_and_vars, gen_loss_GAN, gen_loss_L1, gen_grads_and_vars, train") # 图像预处理 [0, 1] => [-1, 1] def preprocess(image): with tf.name_scope("preprocess"): return image * 2 - 1 # 图像后处理[-1, 1] => [0, 1] def deprocess(image): with tf.name_scope("deprocess"): return (image + 1) / 2 # 判别器的卷积定义,batch_input为 [ batch , 256 , 256 , 6 ] def discrim_conv(batch_input, out_channels, stride): # [ batch , 256 , 256 , 6 ] ===>[ batch , 258 , 258 , 6 ] padded_input = tf.pad(batch_input, [[0, 0], [1, 1], [1, 1], [0, 0]], mode="CONSTANT") ''' [0,0]: 第一维batch大小不扩充 [1,1]:第二维图像宽度左右各扩充一列,用0填充 [1,1]:第三维图像高度上下各扩充一列,用0填充 [0,0]:第四维图像通道不做扩充 ''' return tf.layers.conv2d(padded_input, out_channels, kernel_size=4, strides=(stride, stride), padding="valid", kernel_initializer=tf.random_normal_initializer(0, 0.02)) # 生成器的卷积定义,卷积核为4*4,步长为2,输出图像为输入的一半 def gen_conv(batch_input, out_channels): # [batch, in_height, in_width, in_channels] => [batch, out_height, out_width, out_channels] initializer = tf.random_normal_initializer(0, 0.02) if separable_conv: return tf.layers.separable_conv2d(batch_input, out_channels, kernel_size=4, strides=(2, 2), padding="same", depthwise_initializer=initializer, pointwise_initializer=initializer) else: return tf.layers.conv2d(batch_input, out_channels, kernel_size=4, strides=(2, 2), padding="same", kernel_initializer=initializer) # 生成器的反卷积定义 def gen_deconv(batch_input, out_channels): # [batch, in_height, in_width, in_channels] => [batch, out_height, out_width, out_channels] initializer = tf.random_normal_initializer(0, 0.02) if separable_conv: _b, h, w, _c = batch_input.shape resized_input = tf.image.resize_images(batch_input, [h * 2, w * 2], method=tf.image.ResizeMethod.NEAREST_NEIGHBOR) return tf.layers.separable_conv2d(resized_input, out_channels, kernel_size=4, strides=(1, 1), padding="same", depthwise_initializer=initializer, pointwise_initializer=initializer) else: return tf.layers.conv2d_transpose(batch_input, out_channels, kernel_size=4, strides=(2, 2), padding="same", kernel_initializer=initializer) # 定义LReLu激活函数 def lrelu(x, a): with tf.name_scope("lrelu"): # adding these together creates the leak part and linear part # then cancels them out by subtracting/adding an absolute value term # leak: a*x/2 - a*abs(x)/2 # linear: x/2 + abs(x)/2 # this block looks like it has 2 inputs on the graph unless we do this x = tf.identity(x) return (0.5 * (1 + a)) * x + (0.5 * (1 - a)) * tf.abs(x) # 批量归一化图像 def batchnorm(inputs): return tf.layers.batch_normalization(inputs, axis=3, epsilon=1e-5, momentum=0.1, training=True, gamma_initializer=tf.random_normal_initializer(1.0, 0.02)) # 检查图像的维度 def check_image(image): assertion = tf.assert_equal(tf.shape(image)[-1], 3, message="image must have 3 color channels") with tf.control_dependencies([assertion]): image = tf.identity(image) if image.get_shape().ndims not in (3, 4): raise ValueError("image must be either 3 or 4 dimensions") # make the last dimension 3 so that you can unstack the colors shape = list(image.get_shape()) shape[-1] = 3 image.set_shape(shape) return image # 去除文件的后缀,获取文件名 def get_name(path): # os.path.basename(),返回path最后的文件名。若path以/或\结尾,那么就会返回空值。 # os.path.splitext(),分离文件名与扩展名;默认返回(fname,fextension)元组 name, _ = os.path.splitext(os.path.basename(path)) return name # 加载数据集,从文件读取-->解码-->归一化--->拆分为输入和目标-->像素转为[-1,1]-->转变形状 def load_examples(input_dir): if input_dir is None or not os.path.exists(input_dir): raise Exception("input_dir does not exist") # 匹配第一个参数的路径中所有的符合条件的文件,并将其以list的形式返回。 input_paths = glob.glob(os.path.join(input_dir, "*.jpg")) # 图像解码器 decode = tf.image.decode_jpeg if len(input_paths) == 0: input_paths = glob.glob(os.path.join(input_dir, "*.png")) decode = tf.image.decode_png if len(input_paths) == 0: raise Exception("input_dir contains no image files") # 如果文件名是数字,则用数字进行排序,否则用字母排序 if all(get_name(path).isdigit() for path in input_paths): input_paths = sorted(input_paths, key=lambda path: int(get_name(path))) else: input_paths = sorted(input_paths) sess = tf.Session() with tf.name_scope("load_images"): # 把我们需要的全部文件打包为一个tf内部的queue类型,之后tf开文件就从这个queue中取目录了, # 如果是训练模式时,shuffle为True path_queue = tf.train.string_input_producer(input_paths, shuffle=True) # Read的输出将是一个文件名(key)和该文件的内容(value,每次读取一个文件,分多次读取)。 reader = tf.WholeFileReader() paths, contents = reader.read(path_queue) # 对文件进行解码并且对图片作归一化处理 raw_input = decode(contents) raw_input = tf.image.convert_image_dtype(raw_input, dtype=tf.float32) # 归一化处理 # 判断两个值知否相等,如果不等抛出异常 assertion = tf.assert_equal(tf.shape(raw_input)[2], 3, message="image does not have 3 channels") ''' 对于control_dependencies这个管理器,只有当里面的操作是一个op时,才会生效,也就是先执行传入的 参数op,再执行里面的op。如果里面的操作不是定义的op,图中就不会形成一个节点,这样该管理器就失效了。 tf.identity是返回一个一模一样新的tensor的op,这会增加一个新节点到gragh中,这时control_dependencies就会生效. ''' with tf.control_dependencies([assertion]): raw_input = tf.identity(raw_input) raw_input.set_shape([None, None, 3]) # 图像值由[0,1]--->[-1, 1] width = tf.shape(raw_input)[1] # [height, width, channels] a_images = preprocess(raw_input[:, :width // 2, :]) # 256*256*3 b_images = preprocess(raw_input[:, width // 2:, :]) # 256*256*3 # 这里的which_direction为:BtoA if which_direction == "AtoB": inputs, targets = [a_images, b_images] elif which_direction == "BtoA": inputs, targets = [b_images, a_images] else: raise Exception("invalid direction") # synchronize seed for image operations so that we do the same operations to both # input and output images seed = random.randint(0, 2 ** 31 - 1) # 图像预处理,翻转、改变形状 with tf.name_scope("input_images"): input_images = transform(inputs) with tf.name_scope("target_images"): target_images = transform(targets) # 获得输入图像、目标图像的batch块 paths_batch, inputs_batch, targets_batch = tf.train.batch([paths, input_images, target_images], batch_size=batch_size) steps_per_epoch = int(math.ceil(len(input_paths) / batch_size)) return Examples( paths=paths_batch, # 输入的文件名块 inputs=inputs_batch, # 输入的图像块 targets=targets_batch, # 目标图像块 count=len(input_paths), # 数据集的大小 steps_per_epoch=steps_per_epoch, # batch的个数 ) # 图像预处理,翻转、改变形状 def transform(image): r = image if flip: r = tf.image.random_flip_left_right(r, seed=seed) # area produces a nice downscaling, but does nearest neighbor for upscaling # assume we're going to be doing downscaling here r = tf.image.resize_images(r, [scale_size, scale_size], method=tf.image.ResizeMethod.AREA) offset = tf.cast(tf.floor(tf.random_uniform([2], 0, scale_size - CROP_SIZE + 1, seed=seed)), dtype=tf.int32) if scale_size > CROP_SIZE: r = tf.image.crop_to_bounding_box(r, offset[0], offset[1], CROP_SIZE, CROP_SIZE) elif scale_size < CROP_SIZE: raise Exception("scale size cannot be less than crop size") return r # 创建生成器,这是一个编码解码器的变种,输入输出均为:256*256*3, 像素值为[-1,1] def create_generator(generator_inputs, generator_outputs_channels): layers = [] # encoder_1: [batch, 256, 256, in_channels] => [batch, 128, 128, ngf] with tf.variable_scope("encoder_1"): output = gen_conv(generator_inputs, ngf) # ngf为第一个卷积层的卷积核核数量,默认为 64 layers.append(output) layer_specs = [ ngf * 2, # encoder_2: [batch, 128, 128, ngf] => [batch, 64, 64, ngf * 2] ngf * 4, # encoder_3: [batch, 64, 64, ngf * 2] => [batch, 32, 32, ngf * 4] ngf * 8, # encoder_4: [batch, 32, 32, ngf * 4] => [batch, 16, 16, ngf * 8] ngf * 8, # encoder_5: [batch, 16, 16, ngf * 8] => [batch, 8, 8, ngf * 8] ngf * 8, # encoder_6: [batch, 8, 8, ngf * 8] => [batch, 4, 4, ngf * 8] ngf * 8, # encoder_7: [batch, 4, 4, ngf * 8] => [batch, 2, 2, ngf * 8] ngf * 8, # encoder_8: [batch, 2, 2, ngf * 8] => [batch, 1, 1, ngf * 8] ] # 卷积的编码器 for out_channels in layer_specs: with tf.variable_scope("encoder_%d" % (len(layers) + 1)): # 对最后一层使用激活函数 rectified = lrelu(layers[-1], 0.2) # [batch, in_height, in_width, in_channels] => [batch, in_height/2, in_width/2, out_channels] convolved = gen_conv(rectified, out_channels) output = batchnorm(convolved) layers.append(output) layer_specs = [ (ngf * 8, 0.5), # decoder_8: [batch, 1, 1, ngf * 8] => [batch, 2, 2, ngf * 8 * 2] (ngf * 8, 0.5), # decoder_7: [batch, 2, 2, ngf * 8 * 2] => [batch, 4, 4, ngf * 8 * 2] (ngf * 8, 0.5), # decoder_6: [batch, 4, 4, ngf * 8 * 2] => [batch, 8, 8, ngf * 8 * 2] (ngf * 8, 0.0), # decoder_5: [batch, 8, 8, ngf * 8 * 2] => [batch, 16, 16, ngf * 8 * 2] (ngf * 4, 0.0), # decoder_4: [batch, 16, 16, ngf * 8 * 2] => [batch, 32, 32, ngf * 4 * 2] (ngf * 2, 0.0), # decoder_3: [batch, 32, 32, ngf * 4 * 2] => [batch, 64, 64, ngf * 2 * 2] (ngf, 0.0), # decoder_2: [batch, 64, 64, ngf * 2 * 2] => [batch, 128, 128, ngf * 2] ] # 卷积的解码器 num_encoder_layers = len(layers) # 8 for decoder_layer, (out_channels, dropout) in enumerate(layer_specs): skip_layer = num_encoder_layers - decoder_layer - 1 with tf.variable_scope("decoder_%d" % (skip_layer + 1)): if decoder_layer == 0: # first decoder layer doesn't have skip connections # since it is directly connected to the skip_layer input = layers[-1] else: input = tf.concat([layers[-1], layers[skip_layer]], axis=3) rectified = tf.nn.relu(input) # [batch, in_height, in_width, in_channels] => [batch, in_height*2, in_width*2, out_channels] output = gen_deconv(rectified, out_channels) output = batchnorm(output) if dropout > 0.0: output = tf.nn.dropout(output, keep_prob=1 - dropout) layers.append(output) # decoder_1: [batch, 128, 128, ngf * 2] => [batch, 256, 256, generator_outputs_channels] with tf.variable_scope("decoder_1"): input = tf.concat([layers[-1], layers[0]], axis=3) rectified = tf.nn.relu(input) output = gen_deconv(rectified, generator_outputs_channels) output = tf.tanh(output) layers.append(output) return layers[-1] # 创建判别器,输入生成的图像和真实的图像:两个[batch,256,256,3],元素值值[-1,1],输出:[batch,30,30,1],元素值为概率 def create_discriminator(discrim_inputs, discrim_targets): n_layers = 3 layers = [] # 2x [batch, height, width, in_channels] => [batch, height, width, in_channels * 2] input = tf.concat([discrim_inputs, discrim_targets], axis=3) # layer_1: [batch, 256, 256, in_channels * 2] => [batch, 128, 128, ndf] with tf.variable_scope("layer_1"): convolved = discrim_conv(input, ndf, stride=2) rectified = lrelu(convolved, 0.2) layers.append(rectified) # layer_2: [batch, 128, 128, ndf] => [batch, 64, 64, ndf * 2] # layer_3: [batch, 64, 64, ndf * 2] => [batch, 32, 32, ndf * 4] # layer_4: [batch, 32, 32, ndf * 4] => [batch, 31, 31, ndf * 8] for i in range(n_layers): with tf.variable_scope("layer_%d" % (len(layers) + 1)): out_channels = ndf * min(2 ** (i + 1), 8) stride = 1 if i == n_layers - 1 else 2 # last layer here has stride 1 convolved = discrim_conv(layers[-1], out_channels, stride=stride) normalized = batchnorm(convolved) rectified = lrelu(normalized, 0.2) layers.append(rectified) # layer_5: [batch, 31, 31, ndf * 8] => [batch, 30, 30, 1] with tf.variable_scope("layer_%d" % (len(layers) + 1)): convolved = discrim_conv(rectified, out_channels=1, stride=1) output = tf.sigmoid(convolved) layers.append(output) return layers[-1] # 创建Pix2Pix模型,inputs和targets形状为:[batch_size, height, width, channels] def create_model(inputs, targets): with tf.variable_scope("generator"): out_channels = int(targets.get_shape()[-1]) outputs = create_generator(inputs, out_channels) # create two copies of discriminator, one for real pairs and one for fake pairs # they share the same underlying variables with tf.name_scope("real_discriminator"): with tf.variable_scope("discriminator"): # 2x [batch, height, width, channels] => [batch, 30, 30, 1] predict_real = create_discriminator(inputs, targets) # 条件变量图像和真实图像 with tf.name_scope("fake_discriminator"): with tf.variable_scope("discriminator", reuse=True): # 2x [batch, height, width, channels] => [batch, 30, 30, 1] predict_fake = create_discriminator(inputs, outputs) # 条件变量图像和生成的图像 # 判别器的损失,判别器希望V(G,D)尽可能大 with tf.name_scope("discriminator_loss"): # minimizing -tf.log will try to get inputs to 1 # predict_real => 1 # predict_fake => 0 discrim_loss = tf.reduce_mean(-(tf.log(predict_real + EPS) + tf.log(1 - predict_fake + EPS))) # 生成器的损失,生成器希望V(G,D)尽可能小 with tf.name_scope("generator_loss"): # predict_fake => 1 # abs(targets - outputs) => 0 gen_loss_GAN = tf.reduce_mean(-tf.log(predict_fake + EPS)) gen_loss_L1 = tf.reduce_mean(tf.abs(targets - outputs)) gen_loss = gen_loss_GAN * gan_weight + gen_loss_L1 * l1_weight # 判别器训练 with tf.name_scope("discriminator_train"): # 判别器需要优化的参数 discrim_tvars = [var for var in tf.trainable_variables() if var.name.startswith("discriminator")] # 优化器定义 discrim_optim = tf.train.AdamOptimizer(lr, beta1) # 计算损失函数对优化参数的梯度 discrim_grads_and_vars = discrim_optim.compute_gradients(discrim_loss, var_list=discrim_tvars) # 更新该梯度所对应的参数的状态,返回一个op discrim_train = discrim_optim.apply_gradients(discrim_grads_and_vars) # 生成器训练 with tf.name_scope("generator_train"): with tf.control_dependencies([discrim_train]): # 生成器需要优化的参数列表 gen_tvars = [var for var in tf.trainable_variables() if var.name.startswith("generator")] # 定义优化器 gen_optim = tf.train.AdamOptimizer(lr, beta1) # 计算需要优化的参数的梯度 gen_grads_and_vars = gen_optim.compute_gradients(gen_loss, var_list=gen_tvars) # 更新该梯度所对应的参数的状态,返回一个op gen_train = gen_optim.apply_gradients(gen_grads_and_vars) ''' 在采用随机梯度下降算法训练神经网络时,使用 tf.train.ExponentialMovingAverage 滑动平均操作的意义在于 提高模型在测试数据上的健壮性(robustness)。tensorflow 下的 tf.train.ExponentialMovingAverage 需要 提供一个衰减率(decay)。该衰减率用于控制模型更新的速度。该衰减率用于控制模型更新的速度, ExponentialMovingAverage 对每一个(待更新训练学习的)变量(variable)都会维护一个影子变量 (shadow variable)。影子变量的初始值就是这个变量的初始值, shadow_variable=decay×shadow_variable+(1−decay)×variable ''' ema = tf.train.ExponentialMovingAverage(decay=0.99) update_losses = ema.apply([discrim_loss, gen_loss_GAN, gen_loss_L1]) # global_step = tf.train.get_or_create_global_step() incr_global_step = tf.assign(global_step, global_step + 1) return Model( predict_real=predict_real, # 条件变量(输入图像)和真实图像之间的概率值,形状为;[batch,30,30,1] predict_fake=predict_fake, # 条件变量(输入图像)和生成图像之间的概率值,形状为;[batch,30,30,1] discrim_loss=ema.average(discrim_loss), # 判别器损失 discrim_grads_and_vars=discrim_grads_and_vars, # 判别器需要优化的参数和对应的梯度 gen_loss_GAN=ema.average(gen_loss_GAN), # 生成器的损失 gen_loss_L1=ema.average(gen_loss_L1), # 生成器的 L1损失 gen_grads_and_vars=gen_grads_and_vars, # 生成器需要优化的参数和对应的梯度 outputs=outputs, # 生成器生成的图片 train=tf.group(update_losses, incr_global_step, gen_train), # 打包需要run的操作op ) # 保存图像 def save_images(output_dir, fetches, step=None): image_dir = os.path.join(output_dir, "images") if not os.path.exists(image_dir): os.makedirs(image_dir) filesets = [] for i, in_path in enumerate(fetches["paths"]): name, _ = os.path.splitext(os.path.basename(in_path.decode("utf8"))) fileset = {"name": name, "step": step} for kind in ["inputs", "outputs", "targets"]: filename = name + "-" + kind + ".png" if step is not None: filename = "%08d-%s" % (step, filename) fileset[kind] = filename out_path = os.path.join(image_dir, filename) contents = fetches[kind][i] with open(out_path, "wb") as f: f.write(contents) filesets.append(fileset) return filesets # 将结果写入HTML网页 def append_index(output_dir, filesets, step=False): index_path = os.path.join(output_dir, "index.html") if os.path.exists(index_path): index = open(index_path, "a") else: index = open(index_path, "w") index.write("<html><body><table><tr>") if step: index.write("<th>step</th>") index.write("<th>name</th><th>input</th><th>output</th><th>target</th></tr>") for fileset in filesets: index.write("<tr>") if step: index.write("<td>%d</td>" % fileset["step"]) index.write("<td>%s</td>" % fileset["name"]) for kind in ["inputs", "outputs", "targets"]: index.write("<td><img src='images/%s'></td>" % fileset[kind]) index.write("</tr>") return index_path # 转变图像的尺寸、并且将[0,1]--->[0,255] def convert(image): if aspect_ratio != 1.0: # upscale to correct aspect ratio size = [CROP_SIZE, int(round(CROP_SIZE * aspect_ratio))] image = tf.image.resize_images(image, size=size, method=tf.image.ResizeMethod.BICUBIC) # 将数据的类型转换为8位无符号整型 return tf.image.convert_image_dtype(image, dtype=tf.uint8, saturate=True) # 主函数 def train(): # 设置随机数种子的值 global seed if seed is None: seed = random.randint(0, 2 ** 31 - 1) tf.set_random_seed(seed) np.random.seed(seed) random.seed(seed) # 创建目录 if not os.path.exists(train_output_dir): os.makedirs(train_output_dir) # 加载数据集,得到输入数据和目标数据并把范围变为 :[-1,1] examples = load_examples(train_input_dir) print("load successful ! examples count = %d" % examples.count) # 创建模型,inputs和targets是:[batch_size, height, width, channels] # 返回值: model = create_model(examples.inputs, examples.targets) print("create model successful!") # 图像处理[-1, 1] => [0, 1] inputs = deprocess(examples.inputs) targets = deprocess(examples.targets) outputs = deprocess(model.outputs) # 把[0,1]的像素点转为RGB值:[0,255] with tf.name_scope("convert_inputs"): converted_inputs = convert(inputs) with tf.name_scope("convert_targets"): converted_targets = convert(targets) with tf.name_scope("convert_outputs"): converted_outputs = convert(outputs) # 对图像进行编码以便于保存 with tf.name_scope("encode_images"): display_fetches = { "paths": examples.paths, # tf.map_fn接受一个函数对象和集合,用函数对集合中每个元素分别处理 "inputs": tf.map_fn(tf.image.encode_png, converted_inputs, dtype=tf.string, name="input_pngs"), "targets": tf.map_fn(tf.image.encode_png, converted_targets, dtype=tf.string, name="target_pngs"), "outputs": tf.map_fn(tf.image.encode_png, converted_outputs, dtype=tf.string, name="output_pngs"), } with tf.name_scope("parameter_count"): parameter_count = tf.reduce_sum([tf.reduce_prod(tf.shape(v)) for v in tf.trainable_variables()]) # 只保存最新一个checkpoint saver = tf.train.Saver(max_to_keep=20) init = tf.global_variables_initializer() with tf.Session() as sess: sess.run(init) print("parameter_count =", sess.run(parameter_count)) if max_epochs is not None: max_steps = examples.steps_per_epoch * max_epochs # 400X200=80000 # 因为是从文件中读取数据,所以需要启动start_queue_runners() # 这个函数将会启动输入管道的线程,填充样本到队列中,以便出队操作可以从队列中拿到样本。 coord = tf.train.Coordinator() threads = tf.train.start_queue_runners(coord=coord) # 运行训练集 print("begin trainning......") print("max_steps:", max_steps) start = time.time() for step in range(max_steps): def should(freq): return freq > 0 and ((step + 1) % freq == 0 or step == max_steps - 1) print("step:", step) # 定义一个需要run的所有操作的字典 fetches = { "train": model.train } # progress_freq为 50,每50次计算一次三个损失,显示进度 if should(progress_freq): fetches["discrim_loss"] = model.discrim_loss fetches["gen_loss_GAN"] = model.gen_loss_GAN fetches["gen_loss_L1"] = model.gen_loss_L1 # display_freq为 50,每50次保存一次输入、目标、输出的图像 if should(display_freq): fetches["display"] = display_fetches # 运行各种操作, results = sess.run(fetches) # display_freq为 50,每50次保存输入、目标、输出的图像 if should(display_freq): print("saving display images") filesets = save_images(train_output_dir, results["display"], step=step) append_index(train_output_dir, filesets, step=True) # progress_freq为 50,每50次打印一次三种损失的大小,显示进度 if should(progress_freq): # global_step will have the correct step count if we resume from a checkpoint train_epoch = math.ceil(step / examples.steps_per_epoch) train_step = (step - 1) % examples.steps_per_epoch + 1 rate = (step + 1) * batch_size / (time.time() - start) remaining = (max_steps - step) * batch_size / rate print("progress epoch %d step %d image/sec %0.1f remaining %dm" % ( train_epoch, train_step, rate, remaining / 60)) print("discrim_loss", results["discrim_loss"]) print("gen_loss_GAN", results["gen_loss_GAN"]) print("gen_loss_L1", results["gen_loss_L1"]) # save_freq为500,每500次保存一次模型 if should(save_freq): print("saving model") saver.save(sess, os.path.join(train_output_dir, "model"), global_step=step) # 测试 def test(): # 设置随机数种子的值 global seed if seed is None: seed = random.randint(0, 2 ** 31 - 1) tf.set_random_seed(seed) np.random.seed(seed) random.seed(seed) # 创建目录 if not os.path.exists(test_output_dir): os.makedirs(test_output_dir) if checkpoint is None: raise Exception("checkpoint required for test mode") # disable these features in test mode scale_size = CROP_SIZE flip = False # 加载数据集,得到输入数据和目标数据 examples = load_examples(test_input_dir) print("load successful ! examples count = %d" % examples.count) # 创建模型,inputs和targets是:[batch_size, height, width, channels] model = create_model(examples.inputs, examples.targets) print("create model successful!") # 图像处理[-1, 1] => [0, 1] inputs = deprocess(examples.inputs) targets = deprocess(examples.targets) outputs = deprocess(model.outputs) # 把[0,1]的像素点转为RGB值:[0,255] with tf.name_scope("convert_inputs"): converted_inputs = convert(inputs) with tf.name_scope("convert_targets"): converted_targets = convert(targets) with tf.name_scope("convert_outputs"): converted_outputs = convert(outputs) # 对图像进行编码以便于保存 with tf.name_scope("encode_images"): display_fetches = { "paths": examples.paths, # tf.map_fn接受一个函数对象和集合,用函数对集合中每个元素分别处理 "inputs": tf.map_fn(tf.image.encode_png, converted_inputs, dtype=tf.string, name="input_pngs"), "targets": tf.map_fn(tf.image.encode_png, converted_targets, dtype=tf.string, name="target_pngs"), "outputs": tf.map_fn(tf.image.encode_png, converted_outputs, dtype=tf.string, name="output_pngs"), } sess = tf.InteractiveSession() saver = tf.train.Saver(max_to_keep=1) ckpt = tf.train.get_checkpoint_state(checkpoint) saver.restore(sess,ckpt.model_checkpoint_path) start = time.time() coord = tf.train.Coordinator() threads = tf.train.start_queue_runners(coord=coord) for step in range(examples.count): results = sess.run(display_fetches) filesets = save_images(test_output_dir, results) for i, f in enumerate(filesets): print("evaluated image", f["name"]) index_path = append_index(test_output_dir, filesets) print("wrote index at", index_path) print("rate", (time.time() - start) / max_steps) if __name__ == '__main__': train() #test()

Tensorflow建一个神经网络,输出数据只有一个谱型,且杂乱

建了一个神经网络,输入节点3个,输出250个,两个隐藏层,节点数分别为200个。 训练数据集为100000个。运行完后用测试集验证,发现预测的谱线杂乱无章,跟测试的谱线集完全无关,从图中看感觉是在一个谱型附近震荡。 初学者不明白是什么原因,不知有没有大神可以稍加指教。 ``` import tensorflow as tf import numpy as np # 添加层 def add_layer(inputs, in_size, out_size,n_layer,activation_function=None): Weights = tf.Variable(tf.random_normal([in_size, out_size])) Wx_plus_b = tf.matmul(inputs, Weights) if activation_function is None: outputs = Wx_plus_b else: outputs = activation_function(Wx_plus_b) return outputs # 1.训练的数据 p_1= np.loadtxt('D:p_train.txt') p=np.reshape(p_1,(3,100000)) s_1= np.loadtxt('D:s_train.txt') s=np.reshape(s_1,(250,100000)) pmin=p.min() pmax=p.max() p_train=(p-pmin)/(pmax-pmin) smin=s.min() smax=s.max() s_train=(s-smin)/(smax-smin) p_train=np.transpose(p_train) s_train=np.transpose(s_train) p_train=p_train.tolist() s_train=s_train.tolist() # 2.测试的数据 p_2=np.loadtxt('D:p_test.txt') p2=np.reshape(p_2,(3,5501)) s_2=np.loadtxt('D:s_test.txt') s2=np.reshape(s_2,(250,5501)) pmin2=p2.min() pmax2=p2.max() p_test=(p2-pmin2)/(pmax2-pmin2) smin2=s2.min() smax2=s2.max() s_test=(s2-smin2)/(smax2-smin2) p_test=np.transpose(p_test) s_test=np.transpose(s_test) p_test=p_test.tolist() s_test=s_test.tolist() # 3.定义占位符 px = tf.placeholder(tf.float32, [None, 3]) sx = tf.placeholder(tf.float32, [None,250]) sy=tf.placeholder(tf.float32,[None,250]) # 4.定义神经层:隐藏层和预测层 l1 = add_layer(px, 3, 200, n_layer=1,activation_function=tf.nn.sigmoid) l2=add_layer(l1,200,200,n_layer=2,activation_function=tf.nn.sigmoid) prediction = add_layer(l2, 200, 250, n_layer=3,activation_function=None) # 5.定义 loss 表达式 mse loss = tf.reduce_mean(tf.square(sx - prediction)) #loss2 # 6.选择 optimizer 使 loss 达到最小 train_step = tf.train.AdamOptimizer(0.01,epsilon=1e-8).minimize(loss) #7.初始化变量 init=tf.initialize_all_variables() #8.定义会话 sess = tf.Session() #9.运行 sess.run(init) #10.查看loss变化 for step in range(1000): sess.run(train_step, feed_dict={px:p_train, sx:s_train}) if step % 50 == 0: print(sess.run(loss,feed_dict={sx:s_train,px:p_train})) prediction_test=sess.run(prediction,feed_dict={px:p_test}) ```

2019 Python开发者日-培训

2019 Python开发者日-培训

150讲轻松搞定Python网络爬虫

150讲轻松搞定Python网络爬虫

设计模式(JAVA语言实现)--20种设计模式附带源码

设计模式(JAVA语言实现)--20种设计模式附带源码

YOLOv3目标检测实战:训练自己的数据集

YOLOv3目标检测实战:训练自己的数据集

java后台+微信小程序 实现完整的点餐系统

java后台+微信小程序 实现完整的点餐系统

三个项目玩转深度学习(附1G源码)

三个项目玩转深度学习(附1G源码)

初级玩转Linux+Ubuntu(嵌入式开发基础课程)

初级玩转Linux+Ubuntu(嵌入式开发基础课程)

2019 AI开发者大会

2019 AI开发者大会

玩转Linux:常用命令实例指南

玩转Linux:常用命令实例指南

一学即懂的计算机视觉(第一季)

一学即懂的计算机视觉(第一季)

4小时玩转微信小程序——基础入门与微信支付实战

4小时玩转微信小程序——基础入门与微信支付实战

Git 实用技巧

Git 实用技巧

Python数据清洗实战入门

Python数据清洗实战入门

使用TensorFlow+keras快速构建图像分类模型

使用TensorFlow+keras快速构建图像分类模型

实用主义学Python(小白也容易上手的Python实用案例)

实用主义学Python(小白也容易上手的Python实用案例)

程序员的算法通关课:知己知彼(第一季)

程序员的算法通关课:知己知彼(第一季)

MySQL数据库从入门到实战应用

MySQL数据库从入门到实战应用

机器学习初学者必会的案例精讲

机器学习初学者必会的案例精讲

手把手实现Java图书管理系统(附源码)

手把手实现Java图书管理系统(附源码)

极简JAVA学习营第四期(报名以后加助教微信:eduxy-1)

极简JAVA学习营第四期(报名以后加助教微信:eduxy-1)

.net core快速开发框架

.net core快速开发框架

玩转Python-Python3基础入门

玩转Python-Python3基础入门

Python数据挖掘简易入门

Python数据挖掘简易入门

微信公众平台开发入门

微信公众平台开发入门

程序员的兼职技能课

程序员的兼职技能课

Windows版YOLOv4目标检测实战:训练自己的数据集

Windows版YOLOv4目标检测实战:训练自己的数据集

HoloLens2开发入门教程

HoloLens2开发入门教程

微信小程序开发实战

微信小程序开发实战

Java8零基础入门视频教程

Java8零基础入门视频教程

相关热词 c# 按行txt c#怎么扫条形码 c#打包html c# 实现刷新数据 c# 两个自定义控件重叠 c#浮点类型计算 c#.net 中文乱码 c# 时间排序 c# 必备书籍 c#异步网络通信
立即提问