tensorflow当中的loss里面的logits可不可以是placeholder

我使用tensorflow实现手写数字识别,我希望softmax_cross_entropy_with_logits里面的logits先用一个placeholder表示,然后在计算的时候再通过计算出的值再传给placeholder,但是会报错ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients。我知道直接把logits那里改成outputs就可以了,但是如果我一定要用logits的结果先是一个placeholder,我应该怎么解决呢。

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/home/as/下载/resnet-152_mnist-master/mnist_dataset", one_hot=True)

from tensorflow.contrib.layers import fully_connected

x = tf.placeholder(dtype=tf.float32,shape=[None,784])
y = tf.placeholder(dtype=tf.float32,shape=[None,1])

hidden1 = fully_connected(x,100,activation_fn=tf.nn.elu,
                         weights_initializer=tf.random_normal_initializer())

hidden2 = fully_connected(hidden1,200,activation_fn=tf.nn.elu,
                         weights_initializer=tf.random_normal_initializer())
hidden3 = fully_connected(hidden2,200,activation_fn=tf.nn.elu,
                         weights_initializer=tf.random_normal_initializer())


outputs = fully_connected(hidden3,10,activation_fn=None,
                         weights_initializer=tf.random_normal_initializer())




a = tf.placeholder(tf.float32,[None,10])


loss = tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=a)
reduce_mean_loss = tf.reduce_mean(loss)

equal_result = tf.equal(tf.argmax(outputs,1),tf.argmax(y,1))
cast_result = tf.cast(equal_result,dtype=tf.float32)
accuracy = tf.reduce_mean(cast_result)

train_op = tf.train.AdamOptimizer(0.001).minimize(reduce_mean_loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(30000):
        xs,ys = mnist.train.next_batch(128)
        result = outputs.eval(feed_dict={x:xs})
        sess.run(train_op,feed_dict={a:result,y:ys})
        print(i)

2个回答

可以是placeholder

为什么要占位,a = tf.placeholder(tf.float32,[None,10])
输入才会使用占位,logits是网络输出!

Csdn user default icon
上传中...
上传图片
插入图片
抄袭、复制答案,以达到刷声望分或其他目的的行为,在CSDN问答是严格禁止的,一经发现立刻封号。是时候展现真正的技术了!
其他相关推荐
tensorflow重载模型继续训练得到的loss比原模型继续训练得到的loss大,是什么原因??

我使用tensorflow训练了一个模型,在第10个epoch时保存模型,然后在一个新的文件里重载模型继续训练,结果我发现重载的模型在第一个epoch的loss比原模型在epoch=11的loss要大,我感觉既然是重载了原模型,那么重载模型训练的第一个epoch应该是和原模型训练的第11个epoch相等的,一直找不到问题或者自己逻辑的问题,希望大佬能指点迷津。源代码和重载模型的代码如下: ``` 原代码: from tensorflow.examples.tutorials.mnist import input_data import tensorflow as tf import os import numpy as np os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' mnist = input_data.read_data_sets("./",one_hot=True) tf.reset_default_graph() ###定义数据和标签 n_inputs = 784 n_classes = 10 X = tf.placeholder(tf.float32,[None,n_inputs],name='X') Y = tf.placeholder(tf.float32,[None,n_classes],name='Y') ###定义网络结构 n_hidden_1 = 256 n_hidden_2 = 256 layer_1 = tf.layers.dense(inputs=X,units=n_hidden_1,activation=tf.nn.relu,kernel_regularizer=tf.contrib.layers.l2_regularizer(0.01)) layer_2 = tf.layers.dense(inputs=layer_1,units=n_hidden_2,activation=tf.nn.relu,kernel_regularizer=tf.contrib.layers.l2_regularizer(0.01)) outputs = tf.layers.dense(inputs=layer_2,units=n_classes,name='outputs') pred = tf.argmax(tf.nn.softmax(outputs,axis=1),axis=1) print(pred.name) err = tf.count_nonzero((pred - tf.argmax(Y,axis=1))) cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=outputs,labels=Y),name='cost') print(cost.name) ###定义优化器 learning_rate = 0.001 optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost,name='OP') saver = tf.train.Saver() checkpoint = 'softmax_model/dense_model.cpkt' ###训练 batch_size = 100 training_epochs = 11 with tf.Session() as sess: sess.run(tf.global_variables_initializer()) for epoch in range(training_epochs): batch_num = int(mnist.train.num_examples / batch_size) epoch_cost = 0 sumerr = 0 for i in range(batch_num): batch_x,batch_y = mnist.train.next_batch(batch_size) c,e = sess.run([cost,err],feed_dict={X:batch_x,Y:batch_y}) _ = sess.run(optimizer,feed_dict={X:batch_x,Y:batch_y}) epoch_cost += c / batch_num sumerr += e / mnist.train.num_examples if epoch == (training_epochs - 1): print('batch_cost = ',c) if epoch == (training_epochs - 2): saver.save(sess, checkpoint) print('test_error = ',sess.run(cost, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) ``` ``` 重载模型的代码: from tensorflow.examples.tutorials.mnist import input_data import tensorflow as tf import os os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2' mnist = input_data.read_data_sets("./",one_hot=True) #one_hot=True指对样本标签进行独热编码 file_path = 'softmax_model/dense_model.cpkt' saver = tf.train.import_meta_graph(file_path + '.meta') graph = tf.get_default_graph() X = graph.get_tensor_by_name('X:0') Y = graph.get_tensor_by_name('Y:0') cost = graph.get_operation_by_name('cost').outputs[0] train_op = graph.get_operation_by_name('OP') training_epoch = 10 learning_rate = 0.001 batch_size = 100 with tf.Session() as sess: saver.restore(sess,file_path) print('test_cost = ',sess.run(cost, feed_dict={X: mnist.test.images, Y: mnist.test.labels})) for epoch in range(training_epoch): batch_num = int(mnist.train.num_examples / batch_size) epoch_cost = 0 for i in range(batch_num): batch_x, batch_y = mnist.train.next_batch(batch_size) c = sess.run(cost, feed_dict={X: batch_x, Y: batch_y}) _ = sess.run(train_op, feed_dict={X: batch_x, Y: batch_y}) epoch_cost += c / batch_num print(epoch_cost) ``` 值得注意的是,我在原模型和重载模型里都计算了测试集的cost,两者的结果是一致的。说明参数载入应该是对的

修改的SSD—Tensorflow 版本在训练的时候遇到loss输入维度不一致

目前在学习目标检测识别的方向。 自己参考了一些论文 对原版的SSD进行了一些改动工作 前面的网络模型部分已经修改完成且不报错。 但是在进行训练操作的时候会出现 ’ValueError: Dimension 0 in both shapes must be equal, but are 233920 and 251392. Shapes are [233920] and [251392]. for 'ssd_losses/Select' (op: 'Select') with input shapes: [251392], [233920], [251392]. ‘ ‘两个形状中的尺寸0必须相等,但分别为233920和251392。形状有[233920]和[251392]。对于输入形状为[251392]、[233920]、[251392]的''ssd_losses/Select' (op: 'Select') ![图片说明](https://img-ask.csdn.net/upload/201904/06/1554539638_631515.png) ![图片说明](https://img-ask.csdn.net/upload/201904/06/1554539651_430990.png) # SSD loss function. # =========================================================================== # def ssd_losses(logits, localisations, gclasses, glocalisations, gscores, match_threshold=0.5, negative_ratio=3., alpha=1., label_smoothing=0., device='/cpu:0', scope=None): with tf.name_scope(scope, 'ssd_losses'): lshape = tfe.get_shape(logits[0], 5) num_classes = lshape[-1] batch_size = lshape[0] # Flatten out all vectors! flogits = [] fgclasses = [] fgscores = [] flocalisations = [] fglocalisations = [] for i in range(len(logits)): flogits.append(tf.reshape(logits[i], [-1, num_classes])) fgclasses.append(tf.reshape(gclasses[i], [-1])) fgscores.append(tf.reshape(gscores[i], [-1])) flocalisations.append(tf.reshape(localisations[i], [-1, 4])) fglocalisations.append(tf.reshape(glocalisations[i], [-1, 4])) # And concat the crap! logits = tf.concat(flogits, axis=0) gclasses = tf.concat(fgclasses, axis=0) gscores = tf.concat(fgscores, axis=0) localisations = tf.concat(flocalisations, axis=0) glocalisations = tf.concat(fglocalisations, axis=0) dtype = logits.dtype # Compute positive matching mask... pmask = gscores > match_threshold fpmask = tf.cast(pmask, dtype) n_positives = tf.reduce_sum(fpmask) # Hard negative mining... no_classes = tf.cast(pmask, tf.int32) predictions = slim.softmax(logits) nmask = tf.logical_and(tf.logical_not(pmask), gscores > -0.5) fnmask = tf.cast(nmask, dtype) nvalues = tf.where(nmask, predictions[:, 0], 1. - fnmask) nvalues_flat = tf.reshape(nvalues, [-1]) # Number of negative entries to select. max_neg_entries = tf.cast(tf.reduce_sum(fnmask), tf.int32) n_neg = tf.cast(negative_ratio * n_positives, tf.int32) + batch_size n_neg = tf.minimum(n_neg, max_neg_entries) val, idxes = tf.nn.top_k(-nvalues_flat, k=n_neg) max_hard_pred = -val[-1] # Final negative mask. nmask = tf.logical_and(nmask, nvalues < max_hard_pred) fnmask = tf.cast(nmask, dtype) # Add cross-entropy loss. with tf.name_scope('cross_entropy_pos'): loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=gclasses) loss = tf.div(tf.reduce_sum(loss * fpmask), batch_size, name='value') tf.losses.add_loss(loss) with tf.name_scope('cross_entropy_neg'): loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=no_classes) loss = tf.div(tf.reduce_sum(loss * fnmask), batch_size, name='value') tf.losses.add_loss(loss) # Add localization loss: smooth L1, L2, ... with tf.name_scope('localization'): # Weights Tensor: positive mask + random negative. weights = tf.expand_dims(alpha * fpmask, axis=-1) loss = custom_layers.abs_smooth(localisations - glocalisations) loss = tf.div(tf.reduce_sum(loss * weights), batch_size, name='value') tf.losses.add_loss(loss) ``` ``` 研究了一段时间的源码 (因为只是SSD-Tensorflow-Master中的ssd_vgg_300.py中定义网络结构的那部分做了修改 ,loss函数代码部分并没有进行改动)所以没所到错误所在,网上也找不到相关的解决方案。 希望大神能够帮忙解答 感激不尽~

卷积神经网络训练loss变为nan

卷积神经网络训练,用的是mnist数据集,第一次训练前损失函数还是一个值,训练一次之后就变成nan了,使用的损失函数是ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.argmax(y_, 1)),cem = tf.reduce.mean(ce),应该不会出现真数为零或负的情况,而且训练前loss是存在的,只是训练后变为nan,求各位大牛答疑解惑,感激不尽。![图片说明](https://img-ask.csdn.net/upload/201902/18/1550420090_522553.png)![图片说明](https://img-ask.csdn.net/upload/201902/18/1550420096_230838.png)![图片说明](https://img-ask.csdn.net/upload/201902/18/1550420101_914053.png)

为什么同样的问题用Tensorflow和keras实现结果不一样?

**cifar-10分类问题,同样的模型结构以及损失函数还有学习率参数等超参数,分别用TensorFlow和keras实现。 20个epochs后在测试集上进行预测,准确率总是差好几个百分点,不知道问题出在哪里?代码如下: 这个是TF的代码:** import tensorflow as tf import numpy as np import pickle as pk tf.reset_default_graph() batch_size = 64 test_size = 10000 img_size = 32 num_classes = 10 training_epochs = 10 test_size=200 ############################################################################### def unpickle(filename): '''解压数据''' with open(filename, 'rb') as f: d = pk.load(f, encoding='latin1') return d def onehot(labels): '''one-hot 编码''' n_sample = len(labels) n_class = max(labels) + 1 onehot_labels = np.zeros((n_sample, n_class)) onehot_labels[np.arange(n_sample), labels] = 1 return onehot_labels # 训练数据集 data1 = unpickle('data_batch_1') data2 = unpickle('data_batch_2') data3 = unpickle('data_batch_3') data4 = unpickle('data_batch_4') data5 = unpickle('data_batch_5') X_train = np.concatenate((data1['data'], data2['data'], data3['data'], data4['data'], data5['data']), axis=0)/255.0 y_train = np.concatenate((data1['labels'], data2['labels'], data3['labels'], data4['labels'], data5['labels']), axis=0) y_train = onehot(y_train) # 测试数据集 test = unpickle('test_batch') X_test = test['data']/255.0 y_test = onehot(test['labels']) del test,data1,data2,data3,data4,data5 ############################################################################### w = tf.Variable(tf.random_normal([5, 5, 3, 32], stddev=0.01)) w_c= tf.Variable(tf.random_normal([32* 16* 16, 512], stddev=0.1)) w_o =tf.Variable(tf.random_normal([512, num_classes], stddev=0.1)) def init_bias(shape): return tf.Variable(tf.constant(0.0, shape=shape)) b=init_bias([32]) b_c=init_bias([512]) b_o=init_bias([10]) def model(X, w, w_c,w_o, p_keep_conv, p_keep_hidden,b,b_c,b_o): conv1 = tf.nn.conv2d(X, w,strides=[1, 1, 1, 1],padding='SAME')#32x32x32 conv1=tf.nn.bias_add(conv1,b) conv1 = tf.nn.relu(conv1) conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1],strides=[1, 2, 2, 1],padding='SAME')#16x16x32 conv1 = tf.nn.dropout(conv1, p_keep_conv) FC_layer = tf.reshape(conv1, [-1, 32 * 16 * 16]) out_layer=tf.matmul(FC_layer, w_c)+b_c out_layer=tf.nn.relu(out_layer) out_layer = tf.nn.dropout(out_layer, p_keep_hidden) result = tf.matmul(out_layer, w_o)+b_o return result trX, trY, teX, teY = X_train,y_train,X_test,y_test trX = trX.reshape(-1, img_size, img_size, 3) teX = teX.reshape(-1, img_size, img_size, 3) X = tf.placeholder("float", [None, img_size, img_size, 3]) Y = tf.placeholder("float", [None, num_classes]) p_keep_conv = tf.placeholder("float") p_keep_hidden = tf.placeholder("float") py_x = model(X, w, w_c,w_o, p_keep_conv, p_keep_hidden,b,b_c,b_o) Y_ = tf.nn.softmax_cross_entropy_with_logits_v2(logits=py_x, labels=Y) cost = tf.reduce_mean(Y_) optimizer = tf.train.RMSPropOptimizer(0.001, 0.9).minimize(cost) predict_op = tf.argmax(py_x, 1) with tf.Session() as sess: tf.global_variables_initializer().run() for i in range(training_epochs): training_batch = zip(range(0, len(trX),batch_size),range(batch_size, len(trX)+1,batch_size)) perm=np.arange(len(trX)) np.random.shuffle(perm) trX=trX[perm] trY=trY[perm] for start, end in training_batch: sess.run(optimizer, feed_dict={X: trX[start:end],Y: trY[start:end],p_keep_conv:0.75,p_keep_hidden: 0.5}) test_batch = zip(range(0, len(teX),test_size),range(test_size, len(teX)+1,test_size)) accuracyResult=0 for start, end in test_batch: accuracyResult=accuracyResult+sum(np.argmax(teY[start:end], axis=1) ==sess.run(predict_op, feed_dict={X: teX[start:end],Y: teY[start:end],p_keep_conv: 1,p_keep_hidden: 1})) print(i, accuracyResult/10000) **这个是keras代码:** from keras import initializers from keras.datasets import cifar10 from keras.utils import np_utils from keras.models import Sequential from keras.layers.core import Dense, Dropout, Activation, Flatten from keras.layers.convolutional import Conv2D, MaxPooling2D from keras.optimizers import SGD, Adam, RMSprop #import matplotlib.pyplot as plt # CIFAR_10 is a set of 60K images 32x32 pixels on 3 channels IMG_CHANNELS = 3 IMG_ROWS = 32 IMG_COLS = 32 #constant BATCH_SIZE = 64 NB_EPOCH = 10 NB_CLASSES = 10 VERBOSE = 1 VALIDATION_SPLIT = 0 OPTIM = RMSprop() #load dataset (X_train, y_train), (X_test, y_test) = cifar10.load_data() #print('X_train shape:', X_train.shape) #print(X_train.shape[0], 'train samples') #print(X_test.shape[0], 'test samples') # convert to categorical Y_train = np_utils.to_categorical(y_train, NB_CLASSES) Y_test = np_utils.to_categorical(y_test, NB_CLASSES) # float and normalization X_train = X_train.astype('float32') X_test = X_test.astype('float32') X_train /= 255 X_test /= 255 # network model = Sequential() model.add(Conv2D(32, (3, 3), padding='same',input_shape=(IMG_ROWS, IMG_COLS, IMG_CHANNELS),kernel_initializer=initializers.random_normal(stddev=0.01),bias_initializer=initializers.Zeros())) model.add(Activation('relu')) model.add(MaxPooling2D(pool_size=(2, 2))) model.add(Dropout(0.25)) #0<参数<1才会有用 model.add(Flatten()) model.add(Dense(512,kernel_initializer=initializers.random_normal(stddev=0.1),bias_initializer=initializers.Zeros())) model.add(Activation('relu')) model.add(Dropout(0.5)) model.add(Dense(NB_CLASSES,kernel_initializer=initializers.random_normal(stddev=0.1),bias_initializer=initializers.Zeros())) model.add(Activation('softmax')) model.summary() # train model.compile(loss='categorical_crossentropy', optimizer=OPTIM,metrics=['accuracy']) model.fit(X_train, Y_train, batch_size=BATCH_SIZE,epochs=NB_EPOCH, validation_split=VALIDATION_SPLIT,verbose=VERBOSE) score = model.evaluate(X_test, Y_test,batch_size=200, verbose=VERBOSE) print("Test score:", score[0]) print('Test accuracy:', score[1])

求助,Tensorflow搭建AlexNet模型是训练集验证集的LOSS不收敛

如题,代码如下,请大佬赐教 ``` # coding:utf-8 import tensorflow as tf import numpy as np import time import os import cv2 import matplotlib.pyplot as plt def get_file(file_dir): images = [] labels = [] for root, sub_folders, files in os.walk(file_dir): for name in files: images.append(os.path.join(root, name)) letter = name.split('.')[0] # 对标签进行分类 if letter == 'cat': labels = np.append(labels, [0]) else: labels = np.append(labels, [1]) # shuffle(随机打乱) temp = np.array([images, labels]) temp = temp.transpose() # 建立images 与 labels 之间关系, 以矩阵形式展现 np.random.shuffle(temp) image_list = list(temp[:, 0]) label_list = list(temp[:, 1]) label_list = [int(float(i)) for i in label_list] print(image_list) print(label_list) return image_list, label_list # 返回文件名列表 def _parse_function(image_list, labels_list): image_contents = tf.read_file(image_list) image = tf.image.decode_jpeg(image_contents, channels=3) image = tf.cast(image, tf.float32) image = tf.image.resize_image_with_crop_or_pad(image, 227, 227) # 剪裁或填充处理 image = tf.image.per_image_standardization(image) # 图片标准化 labels = labels_list return image, labels # 将需要读取的数据集地址转换为专用格式 def get_batch(image_list, labels_list, batch_size): image_list = tf.cast(image_list, tf.string) labels_list = tf.cast(labels_list, tf.int32) dataset = tf.data.Dataset.from_tensor_slices((image_list, labels_list)) # 创建dataset dataset = dataset.repeat() # 无限循环 dataset = dataset.map(_parse_function) dataset = dataset.batch(batch_size) dataset = dataset.make_one_shot_iterator() return dataset # 正则化处理数据集 def batch_norm(inputs, is_training, is_conv_out=True, decay=0.999): scale = tf.Variable(tf.ones([inputs.get_shape()[-1]])) beta = tf.Variable(tf.zeros([inputs.get_shape()[-1]])) pop_mean = tf.Variable(tf.zeros([inputs.get_shape()[-1]]), trainable=False) pop_var = tf.Variable(tf.ones([inputs.get_shape()[-1]]), trainable=False) def batch_norm_train(): if is_conv_out: batch_mean, batch_var = tf.nn.moments(inputs, [0, 1, 2]) # 求均值及方差 else: batch_mean, batch_var = tf.nn.moments(inputs, [0]) train_mean = tf.assign(pop_mean, pop_mean * decay + batch_mean * (1 - decay)) train_var = tf.assign(pop_var, pop_var * decay + batch_var * (1 - decay)) with tf.control_dependencies([train_mean, train_var]): # 在train_mean, train_var计算完条件下继续 return tf.nn.batch_normalization(inputs, batch_mean, batch_var, beta, scale, 0.001) def batch_norm_test(): return tf.nn.batch_normalization(inputs, pop_mean, pop_var, beta, scale, 0.001) batch_normalization = tf.cond(is_training, batch_norm_train, batch_norm_test) return batch_normalization # 建立模型 learning_rate = 1e-4 training_iters = 200 batch_size = 50 display_step = 5 n_classes = 2 n_fc1 = 4096 n_fc2 = 2048 # 构建模型 x = tf.placeholder(tf.float32, [None, 227, 227, 3]) y = tf.placeholder(tf.int32, [None]) is_training = tf.placeholder(tf.bool) # 字典模式管理权重与偏置参数 W_conv = { 'conv1': tf.Variable(tf.truncated_normal([11, 11, 3, 96], stddev=0.0001)), 'conv2': tf.Variable(tf.truncated_normal([5, 5, 96, 256], stddev=0.01)), 'conv3': tf.Variable(tf.truncated_normal([3, 3, 256, 384], stddev=0.01)), 'conv4': tf.Variable(tf.truncated_normal([3, 3, 384, 384], stddev=0.01)), 'conv5': tf.Variable(tf.truncated_normal([3, 3, 384, 256], stddev=0.01)), 'fc1': tf.Variable(tf.truncated_normal([6 * 6 * 256, n_fc1], stddev=0.1)), 'fc2': tf.Variable(tf.truncated_normal([n_fc1, n_fc2], stddev=0.1)), 'fc3': tf.Variable(tf.truncated_normal([n_fc2, n_classes], stddev=0.1)), } b_conv = { 'conv1': tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[96])), 'conv2': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[256])), 'conv3': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[384])), 'conv4': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[384])), 'conv5': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[256])), 'fc1': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[n_fc1])), 'fc2': tf.Variable(tf.constant(0.1, dtype=tf.float32, shape=[n_fc2])), 'fc3': tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[n_classes])), } x_image = tf.reshape(x, [-1, 227, 227, 3]) # 卷积层,池化层,LRN层编写 # 第一层卷积层 # 卷积层1 conv1 = tf.nn.conv2d(x_image, W_conv['conv1'], strides=[1, 4, 4, 1], padding='VALID') conv1 = tf.nn.bias_add(conv1, b_conv['conv1']) conv1 = batch_norm(conv1, is_training) #conv1 = tf.layers.batch_normalization(conv1, training=is_training) conv1 = tf.nn.relu(conv1) # 池化层1 pool1 = tf.nn.avg_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID') # LRN层 norm1 = tf.nn.lrn(pool1, 5, bias=1.0, alpha=0.001 / 9.0, beta=0.75) # 第二层卷积 # 卷积层2 conv2 = tf.nn.conv2d(norm1, W_conv['conv2'], strides=[1, 1, 1, 1], padding='SAME') conv2 = tf.nn.bias_add(conv2, b_conv['conv2']) #conv2 = tf.layers.batch_normalization(conv2, training=is_training) conv2 = batch_norm(conv2, is_training) conv2 = tf.nn.relu(conv2) # 池化层2 pool2 = tf.nn.avg_pool(conv2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID') # LRN层 #norm2 = tf.nn.lrn(pool2, 5, bias=1.0, alpha=0.001 / 9.0, beta=0.75) # 第三层卷积 # 卷积层3 conv3 = tf.nn.conv2d(pool2, W_conv['conv3'], strides=[1, 1, 1, 1], padding='SAME') conv3 = tf.nn.bias_add(conv3, b_conv['conv3']) #conv3 = tf.layers.batch_normalization(conv3, training=is_training) conv3 = batch_norm(conv3, is_training) conv3 = tf.nn.relu(conv3) # 第四层卷积 # 卷积层4 conv4 = tf.nn.conv2d(conv3, W_conv['conv4'], strides=[1, 1, 1, 1], padding='SAME') conv4 = tf.nn.bias_add(conv4, b_conv['conv4']) #conv4 = tf.layers.batch_normalization(conv4, training=is_training) conv4 = batch_norm(conv4, is_training) conv4 = tf.nn.relu(conv4) # 第五层卷积 # 卷积层5 conv5 = tf.nn.conv2d(conv4, W_conv['conv5'], strides=[1, 1, 1, 1], padding='SAME') conv5 = tf.nn.bias_add(conv5, b_conv['conv5']) #conv5 = tf.layers.batch_normalization(conv5, training=is_training) conv5 = batch_norm(conv5, is_training) conv5 = tf.nn.relu(conv5) # 池化层5 pool5 = tf.nn.avg_pool(conv5, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='VALID') # 第六层全连接 reshape = tf.reshape(pool5, [-1, 6 * 6 * 256]) #fc1 = tf.matmul(reshape, W_conv['fc1']) fc1 = tf.add(tf.matmul(reshape, W_conv['fc1']), b_conv['fc1']) #fc1 = tf.layers.batch_normalization(fc1, training=is_training) fc1 = batch_norm(fc1, is_training, False) fc1 = tf.nn.relu(fc1) #fc1 = tf.nn.dropout(fc1, 0.5) # 第七层全连接 #fc2 = tf.matmul(fc1, W_conv['fc2']) fc2 = tf.add(tf.matmul(fc1, W_conv['fc2']), b_conv['fc2']) #fc2 = tf.layers.batch_normalization(fc2, training=is_training) fc2 = batch_norm(fc2, is_training, False) fc2 = tf.nn.relu(fc2) #fc2 = tf.nn.dropout(fc2, 0.5) # 第八层全连接(分类层) yop = tf.add(tf.matmul(fc2, W_conv['fc3']), b_conv['fc3']) # 损失函数 #y = tf.stop_gradient(y) loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=yop, labels=y)) optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss) #update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) #with tf.control_dependencies(update_ops): # 保证train_op在update_ops执行之后再执行。 #train_op = optimizer.minimize(loss) # 评估模型 correct_predict = tf.nn.in_top_k(yop, y, 1) accuracy = tf.reduce_mean(tf.cast(correct_predict, tf.float32)) init = tf.global_variables_initializer() def onehot(labels): # 独热编码表示数据 n_sample = len(labels) n_class = max(labels) + 1 onehot_labels = np.zeros((n_sample, n_class)) onehot_labels[np.arange(n_sample), labels] = 1 # python迭代方法,将每一行对应个置1 return onehot_labels save_model = './/model//my-model.ckpt' # 模型训练 def train(epoch): with tf.Session() as sess: sess.run(init) saver = tf.train.Saver(var_list=tf.global_variables()) c = [] b = [] max_acc = 0 start_time = time.time() step = 0 global dataset dataset = dataset.get_next() for i in range(epoch): step = i image, labels = sess.run(dataset) sess.run(optimizer, feed_dict={x: image, y: labels, is_training: True}) # 训练一次 #if i % 5 == 0: loss_record = sess.run(loss, feed_dict={x: image, y: labels, is_training: True}) # 记录一次 #predict = sess.run(yop, feed_dict={x: image, y: labels, is_training: True}) acc = sess.run(accuracy, feed_dict={x: image, y: labels, is_training: True}) print("step:%d, now the loss is %f" % (step, loss_record)) #print(predict[0]) print("acc : %f" % acc) c.append(loss_record) b.append(acc) end_time = time.time() print('time:', (end_time - start_time)) start_time = end_time print('-----------%d opench is finished ------------' % (i / 5)) #if acc > max_acc: # max_acc = acc # saver.save(sess, save_model, global_step=i + 1) print('Optimization Finished!') #saver.save(sess, save_model) print('Model Save Finished!') plt.plot(c) plt.plot(b) plt.xlabel('iter') plt.ylabel('loss') plt.title('lr=%f, ti=%d, bs=%d' % (learning_rate, training_iters, batch_size)) plt.tight_layout() plt.show() X_train, y_train = get_file("D://cat_and_dog//cat_dog_train//cat_dog") # 返回为文件地址 dataset = get_batch(X_train, y_train, 100) train(100) ``` 数据文件夹为猫狗大战那个25000个图片的文件,不加入正则表达层的时候训练集loss会下降,但是acc维持不变,加入__batch norm__或者__tf.layers.batch__normalization 训练集和验证机的loss都不收敛了

tensorflow载入训练好的模型进行预测,同一张图片预测的结果却不一样????

最近在跑deeplabv1,在测试代码的时候,跑通了训练程序,但是用训练好的模型进行与测试却发现相同的图片预测的结果不一样??请问有大神知道怎么回事吗? 用的是saver.restore()方法载入模型。代码如下: ``` def main(): """Create the model and start the inference process.""" args = get_arguments() # Prepare image. img = tf.image.decode_jpeg(tf.read_file(args.img_path), channels=3) # Convert RGB to BGR. img_r, img_g, img_b = tf.split(value=img, num_or_size_splits=3, axis=2) img = tf.cast(tf.concat(axis=2, values=[img_b, img_g, img_r]), dtype=tf.float32) # Extract mean. img -= IMG_MEAN # Create network. net = DeepLabLFOVModel() # Which variables to load. trainable = tf.trainable_variables() # Predictions. pred = net.preds(tf.expand_dims(img, dim=0)) # Set up TF session and initialize variables. config = tf.ConfigProto() config.gpu_options.allow_growth = True sess = tf.Session(config=config) #init = tf.global_variables_initializer() sess.run(tf.global_variables_initializer()) # Load weights. saver = tf.train.Saver(var_list=trainable) load(saver, sess, args.model_weights) # Perform inference. preds = sess.run([pred]) print(preds) if not os.path.exists(args.save_dir): os.makedirs(args.save_dir) msk = decode_labels(np.array(preds)[0, 0, :, :, 0]) im = Image.fromarray(msk) im.save(args.save_dir + 'mask1.png') print('The output file has been saved to {}'.format( args.save_dir + 'mask.png')) if __name__ == '__main__': main() ``` 其中load是 ``` def load(saver, sess, ckpt_path): '''Load trained weights. Args: saver: TensorFlow saver object. sess: TensorFlow session. ckpt_path: path to checkpoint file with parameters. ''' ckpt = tf.train.get_checkpoint_state(ckpt_path) if ckpt and ckpt.model_checkpoint_path: saver.restore(sess, ckpt.model_checkpoint_path) print("Restored model parameters from {}".format(ckpt_path)) ``` DeepLabLFOVMode类如下: ``` class DeepLabLFOVModel(object): """DeepLab-LargeFOV model with atrous convolution and bilinear upsampling. This class implements a multi-layer convolutional neural network for semantic image segmentation task. This is the same as the model described in this paper: https://arxiv.org/abs/1412.7062 - please look there for details. """ def __init__(self, weights_path=None): """Create the model. Args: weights_path: the path to the cpkt file with dictionary of weights from .caffemodel. """ self.variables = self._create_variables(weights_path) def _create_variables(self, weights_path): """Create all variables used by the network. This allows to share them between multiple calls to the loss function. Args: weights_path: the path to the ckpt file with dictionary of weights from .caffemodel. If none, initialise all variables randomly. Returns: A dictionary with all variables. """ var = list() index = 0 if weights_path is not None: with open(weights_path, "rb") as f: weights = cPickle.load(f) # Load pre-trained weights. for name, shape in net_skeleton: var.append(tf.Variable(weights[name], name=name)) del weights else: # Initialise all weights randomly with the Xavier scheme, # and # all biases to 0's. for name, shape in net_skeleton: if "/w" in name: # Weight filter. w = create_variable(name, list(shape)) var.append(w) else: b = create_bias_variable(name, list(shape)) var.append(b) return var def _create_network(self, input_batch, keep_prob): """Construct DeepLab-LargeFOV network. Args: input_batch: batch of pre-processed images. keep_prob: probability of keeping neurons intact. Returns: A downsampled segmentation mask. """ current = input_batch v_idx = 0 # Index variable. # Last block is the classification layer. for b_idx in xrange(len(dilations) - 1): for l_idx, dilation in enumerate(dilations[b_idx]): w = self.variables[v_idx * 2] b = self.variables[v_idx * 2 + 1] if dilation == 1: conv = tf.nn.conv2d(current, w, strides=[ 1, 1, 1, 1], padding='SAME') else: conv = tf.nn.atrous_conv2d( current, w, dilation, padding='SAME') current = tf.nn.relu(tf.nn.bias_add(conv, b)) v_idx += 1 # Optional pooling and dropout after each block. if b_idx < 3: current = tf.nn.max_pool(current, ksize=[1, ks, ks, 1], strides=[1, 2, 2, 1], padding='SAME') elif b_idx == 3: current = tf.nn.max_pool(current, ksize=[1, ks, ks, 1], strides=[1, 1, 1, 1], padding='SAME') elif b_idx == 4: current = tf.nn.max_pool(current, ksize=[1, ks, ks, 1], strides=[1, 1, 1, 1], padding='SAME') current = tf.nn.avg_pool(current, ksize=[1, ks, ks, 1], strides=[1, 1, 1, 1], padding='SAME') elif b_idx <= 6: current = tf.nn.dropout(current, keep_prob=keep_prob) # Classification layer; no ReLU. # w = self.variables[v_idx * 2] w = create_variable(name='w', shape=[1, 1, 1024, n_classes]) # b = self.variables[v_idx * 2 + 1] b = create_bias_variable(name='b', shape=[n_classes]) conv = tf.nn.conv2d(current, w, strides=[1, 1, 1, 1], padding='SAME') current = tf.nn.bias_add(conv, b) return current def prepare_label(self, input_batch, new_size): """Resize masks and perform one-hot encoding. Args: input_batch: input tensor of shape [batch_size H W 1]. new_size: a tensor with new height and width. Returns: Outputs a tensor of shape [batch_size h w 18] with last dimension comprised of 0's and 1's only. """ with tf.name_scope('label_encode'): # As labels are integer numbers, need to use NN interp. input_batch = tf.image.resize_nearest_neighbor( input_batch, new_size) # Reducing the channel dimension. input_batch = tf.squeeze(input_batch, squeeze_dims=[3]) input_batch = tf.one_hot(input_batch, depth=n_classes) return input_batch def preds(self, input_batch): """Create the network and run inference on the input batch. Args: input_batch: batch of pre-processed images. Returns: Argmax over the predictions of the network of the same shape as the input. """ raw_output = self._create_network( tf.cast(input_batch, tf.float32), keep_prob=tf.constant(1.0)) raw_output = tf.image.resize_bilinear( raw_output, tf.shape(input_batch)[1:3, ]) raw_output = tf.argmax(raw_output, dimension=3) raw_output = tf.expand_dims(raw_output, dim=3) # Create 4D-tensor. return tf.cast(raw_output, tf.uint8) def loss(self, img_batch, label_batch): """Create the network, run inference on the input batch and compute loss. Args: input_batch: batch of pre-processed images. Returns: Pixel-wise softmax loss. """ raw_output = self._create_network( tf.cast(img_batch, tf.float32), keep_prob=tf.constant(0.5)) prediction = tf.reshape(raw_output, [-1, n_classes]) # Need to resize labels and convert using one-hot encoding. label_batch = self.prepare_label( label_batch, tf.stack(raw_output.get_shape()[1:3])) gt = tf.reshape(label_batch, [-1, n_classes]) # Pixel-wise softmax loss. loss = tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=gt) reduced_loss = tf.reduce_mean(loss) return reduced_loss ``` 按理说载入模型应该没有问题,可是不知道为什么结果却不一样? 图片:![图片说明](https://img-ask.csdn.net/upload/201911/15/1573810836_83106.jpg) ![图片说明](https://img-ask.csdn.net/upload/201911/15/1573810850_924663.png) 预测的结果: ![图片说明](https://img-ask.csdn.net/upload/201911/15/1573810884_985680.png) ![图片说明](https://img-ask.csdn.net/upload/201911/15/1573810904_577649.png) 两次结果不一样,与保存的模型算出来的结果也不一样。 我用的是GitHub上这个人的代码: https://github.com/minar09/DeepLab-LFOV-TensorFlow 急急急,请问有大神知道吗???

tensorflow cifar10教程如何实现断点续训?我希望每次能从上次的结果继续训练

cifar10_train.py FLAGS = tf.app.flags.FLAGS tf.app.flags.DEFINE_string('train_dir', 'D:/tmp/cifar10_trainn', """Directory where to write event logs """ """and checkpoint.""") tf.app.flags.DEFINE_integer('max_steps', 100000, """Number of batches to run.""") tf.app.flags.DEFINE_boolean('log_device_placement', False, """Whether to log device placement.""") tf.app.flags.DEFINE_integer('log_frequency', 10, """How often to log results to the console.""") def train(): """Train CIFAR-10 for a number of steps.""" with tf.Graph().as_default(): global_step = tf.train.get_or_create_global_step() # Get images and labels for CIFAR-10. # Force input pipeline to CPU:0 to avoid operations sometimes ending up on # GPU and resulting in a slow down. with tf.device('/cpu:0'): images, labels = cifar10.distorted_inputs() # Build a Graph that computes the logits predictions from the # inference model. logits = cifar10.inference(images) # Calculate loss. loss = cifar10.loss(logits, labels) # Build a Graph that trains the model with one batch of examples and # updates the model parameters. train_op = cifar10.train(loss, global_step) class _LoggerHook(tf.train.SessionRunHook): """Logs loss and runtime.""" def begin(self): self._step = -1 self._start_time = time.time() def before_run(self, run_context): self._step += 1 return tf.train.SessionRunArgs(loss) # Asks for loss value. def after_run(self, run_context, run_values): if self._step % FLAGS.log_frequency == 0: current_time = time.time() duration = current_time - self._start_time self._start_time = current_time loss_value = run_values.results examples_per_sec = FLAGS.log_frequency * FLAGS.batch_size / duration sec_per_batch = float(duration / FLAGS.log_frequency) format_str = ('%s: step %d, loss = %.2f (%.1f examples/sec; %.3f ' 'sec/batch)') print (format_str % (datetime.now(), self._step, loss_value, examples_per_sec, sec_per_batch)) saver = tf.train.Saver() with tf.train.MonitoredTrainingSession( checkpoint_dir=FLAGS.train_dir, hooks=[tf.train.StopAtStepHook(last_step=FLAGS.max_steps), tf.train.NanTensorHook(loss), _LoggerHook()], config=tf.ConfigProto( log_device_placement=FLAGS.log_device_placement)) as mon_sess: while not mon_sess.should_stop(): mon_sess.run(train_op) def main(argv=None): # pylint: disable=unused-argument cifar10.maybe_download_and_extract() if tf.gfile.Exists(FLAGS.train_dir): tf.gfile.DeleteRecursively(FLAGS.train_dir) tf.gfile.MakeDirs(FLAGS.train_dir) train() if __name__ == '__main__': tf.app.run()

tensorflow debug 调试错误

# -*- coding: utf-8 -*- """ Created on Sat Oct 28 15:40:51 2017 @author: Administrator """ import tensorflow as tf from tensorflow.python import debug as tf_debug '''载入数据''' from aamodule import input_data mnist = input_data.read_data_sets('d://MNIST',one_hot=True) '''构建回归模型''' #定义回归模型 x = tf.placeholder(tf.float32,[None,784]) y = tf.placeholder(tf.float32,[None,10]) W = tf.Variable(tf.zeros([784,10])) b = tf.Variable(tf.zeros([10])) y_ = tf.matmul(x,W) + b #预测值 #定义损失函数和优化器 cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=y_,labels=y) train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy) '''训练模型''' sess = tf.InteractiveSession() sess.run(tf.global_variables_initializer()) sess = tf_debug.LocalCLIDebugWrapperSession(sess,ui_type='readline') #sess.add_tensor_filter("has_inf_or_nan", tf_debug.has_inf_or_nan) #Train for i in range(1000): batch_xs,batch_ys = mnist.train.next_batch(100) sess.run(train_step,feed_dict={x:batch_xs,y:batch_ys}) #评估训练好的模型 correct_prediction = tf.equal(tf.argmax(y_,1),tf.argmax(y,1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32)) #计算模型在测试集上的准确率 print(sess.run(accuracy,feed_dict={x:mnist.test.images,y:mnist.test.labels})) 加入sess = tf_debug.LocalCLIDebugWrapperSession(sess,ui_type='readline')后就运行不了了,ValueError: Exhausted all fallback ui_types. ``` ```

tensorflow RNN LSTM代码运行不正确?

报错显示是ValueError: None values not supported. 在cross_entropy处有问题。谢谢大家 ``` #7.2 RNN import tensorflow as tf #tf.reset_default_graph() from tensorflow.examples.tutorials.mnist import input_data #载入数据集 mnist = input_data.read_data_sets("MNIST_data/", one_hot = True) #输入图片是28*28 n_inputs = 28 #输入一行,一行有28个数据 max_time = 28 #一共28行 lstm_size = 100 #隐层单元 n_classes = 10 #10个分量 batch_size = 50 #每批次50个样本 n_batch = mnist.train.num_examples//batch_size #计算共由多少个批次 #这里的none表示第一个维度可以是任意长度 x = tf.placeholder(tf.float32, [batch_size, 784]) #正确的标签 y = tf.placeholder(tf.float32, [batch_size, 10]) #初始化权值 weights = tf.Variable(tf.truncated_normal([lstm_size, n_classes], stddev = 0.1)) #初始化偏置 biases = tf.Variable(tf.constant(0.1, shape = [n_classes])) #定义RNN网络 def RNN(X, weights, biases): #input = [batch_size, max_size, n_inputs] inputs = tf.reshape(X, [-1, max_time, n_inputs]) #定义LSTM基本CELL lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(lstm_size) #final_state[0]是cell_state #final_state[1]是hidden_state outputs, final_state = tf.nn.dynamic_rnn(lstm_cell, inputs, dtype = tf.float32) results = tf.nn.softmax(tf.matmul(final_state[1], weights) + biases) #计算RNN的返回结果 prediction = RNN(x, weights, biases) #损失函数 cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels = y,logits = prediction)) #使用AdamOptimizer进行优化 train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy) #结果存放在一个布尔型列表中 correct_prediction = tf.equal(tf.argmax(y, 1),tf.argmax(prediction, 1)) #求准确率 accuracy = tf.reduce_mean(tf.cast(correct_precdition,tf.float32)) #初始化 init = tf.global_variable_initializer() with tf.Session() as sess: sess.run(init) for epoch in range(6): for batch in range(n_batch): batch_xs,batch_ys=mnist.train.next_batch(batch_size) sess.run(train_step,feed_dict={x:batch_xs,y:batch_ys}) acc = sess.run(accuracy, feed_dict={x:mnist.test.images,y:mnist.test.labels}) print('Iter' + str(epoch) + ',Testing Accuracy = ' + str(acc)) ```

(Tensorflow) 在读取文件后,如何将global_step变为0

这是cifar_10的代码,我想每次训练开始的时候global_step都是0,而不是从文件中读到的之前训练留下来的步数,对于MonitoredTrainingSession不是懂,有没有大佬教教我,感谢了! ``` def train(max_step, n): """Train CIFAR-10 for a number of steps.""" """创建图""" with tf.Graph().as_default(): global_step = tf.contrib.framework.get_or_create_global_step() # Get images and labels for CIFAR-10. # Force input pipeline to CPU:0 to avoid operations sometimes ending up on # GPU and resulting in a slow down. with tf.device('/cpu:0'): images, labels = cifar10.distorted_inputs(n) #使用cpu运行 #到了cifar10里面的image以及FLAGS.batch_size=128 #images=/tmp/cifar10_data/cifar-10-batches-bin #labels=128 # Build a Graph that computes the logits predictions from the # inference model. file_output_path=FLAGS.train_dir file_output_path=os.path.join(file_output_path,"train_output.txt") file_output=open(file_output_path,"a") """初始化logits(预测值)以及训练""" logits = cifar10.inference(images) loss = cifar10.loss(logits, labels) train_op = cifar10.train(loss, global_step) # Build a Graph that trains the model with one batch of examples and # updates the model parameters. class _LoggerHook(tf.train.SessionRunHook): """Logs loss and runtime.""" def begin(self): self._step = -1 self._start_time = time.time() def before_run(self, run_context): self._step += 1 return tf.train.SessionRunArgs(loss) # Asks for loss value. def after_run(self, run_context, run_values): if self._step % FLAGS.log_frequency == 0: current_time = time.time() duration = current_time - self._start_time self._start_time = current_time loss_value = run_values.results examples_per_sec = FLAGS.log_frequency * FLAGS.batch_size / duration sec_per_batch = float(duration / FLAGS.log_frequency) format_str = ('whichdata:%d %s: step %d, loss = %.2f (%.1f examples/sec; %.3f ' 'sec/batch)') print (format_str % (n, datetime.now(), self._step, loss_value, examples_per_sec, sec_per_batch)) """ ///////////////// """ file_output=open(file_output_path,"a") file_output.write(format_str % (n, datetime.now(), self._step, loss_value, examples_per_sec, sec_per_batch)+'\n') with tf.train.MonitoredTrainingSession( checkpoint_dir=FLAGS.train_dir, hooks=[tf.train.StopAtStepHook(last_step=max_step), tf.train.NanTensorHook(loss), _LoggerHook()], config=tf.ConfigProto( log_device_placement=FLAGS.log_device_placement)) as mon_sess: while not mon_sess.should_stop(): mon_sess.run(train_op) ```

tensorflow训练完模型直接测试和导入模型进行测试的结果不同,一个很好,一个略差,这是为什么?

在tensorflow训练完模型,我直接采用同一个session进行测试,得到结果较好,但是采用训练完保存的模型,进行重新载入进行测试,结果较差,不懂是为什么会出现这样的结果。注:测试数据是一样的。以下是模型结果: 训练集:loss:0.384,acc:0.931. 验证集:loss:0.212,acc:0.968. 训练完在同一session内的测试集:acc:0.96。导入保存的模型进行测试:acc:0.29 ``` def create_model(hps): global_step = tf.Variable(tf.zeros([], tf.float64), name = 'global_step', trainable = False) scale = 1.0 / math.sqrt(hps.num_embedding_size + hps.num_lstm_nodes[-1]) / 3.0 print(type(scale)) gru_init = tf.random_normal_initializer(-scale, scale) with tf.variable_scope('Bi_GRU_nn', initializer = gru_init): for i in range(hps.num_lstm_layers): cell_bw = tf.contrib.rnn.GRUCell(hps.num_lstm_nodes[i], activation = tf.nn.relu, name = 'cell-bw') cell_bw = tf.contrib.rnn.DropoutWrapper(cell_bw, output_keep_prob = dropout_keep_prob) cell_fw = tf.contrib.rnn.GRUCell(hps.num_lstm_nodes[i], activation = tf.nn.relu, name = 'cell-fw') cell_fw = tf.contrib.rnn.DropoutWrapper(cell_fw, output_keep_prob = dropout_keep_prob) rnn_outputs, _ = tf.nn.bidirectional_dynamic_rnn(cell_bw, cell_fw, inputs, dtype=tf.float32) embeddedWords = tf.concat(rnn_outputs, 2) finalOutput = embeddedWords[:, -1, :] outputSize = hps.num_lstm_nodes[-1] * 2 # 因为是双向LSTM,最终的输出值是fw和bw的拼接,因此要乘以2 last = tf.reshape(finalOutput, [-1, outputSize]) # reshape成全连接层的输入维度 last = tf.layers.batch_normalization(last, training = is_training) fc_init = tf.uniform_unit_scaling_initializer(factor = 1.0) with tf.variable_scope('fc', initializer = fc_init): fc1 = tf.layers.dense(last, hps.num_fc_nodes, name = 'fc1') fc1_batch_normalization = tf.layers.batch_normalization(fc1, training = is_training) fc_activation = tf.nn.relu(fc1_batch_normalization) logits = tf.layers.dense(fc_activation, hps.num_classes, name = 'fc2') with tf.name_scope('metrics'): softmax_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits = logits, labels = tf.argmax(outputs, 1)) loss = tf.reduce_mean(softmax_loss) # [0, 1, 5, 4, 2] ->argmax:2 因为在第二个位置上是最大的 y_pred = tf.argmax(tf.nn.softmax(logits), 1, output_type = tf.int64, name = 'y_pred') # 计算准确率,看看算对多少个 correct_pred = tf.equal(tf.argmax(outputs, 1), y_pred) # tf.cast 将数据转换成 tf.float32 类型 accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32)) with tf.name_scope('train_op'): tvar = tf.trainable_variables() for var in tvar: print('variable name: %s' % (var.name)) grads, _ = tf.clip_by_global_norm(tf.gradients(loss, tvar), hps.clip_lstm_grads) optimizer = tf.train.AdamOptimizer(hps.learning_rate) train_op = optimizer.apply_gradients(zip(grads, tvar), global_step) # return((inputs, outputs, is_training), (loss, accuracy, y_pred), (train_op, global_step)) return((inputs, outputs), (loss, accuracy, y_pred), (train_op, global_step)) placeholders, metrics, others = create_model(hps) content, labels = placeholders loss, accuracy, y_pred = metrics train_op, global_step = others def val_steps(sess, x_batch, y_batch, writer = None): loss_val, accuracy_val = sess.run([loss,accuracy], feed_dict = {inputs: x_batch, outputs: y_batch, is_training: hps.val_is_training, dropout_keep_prob: 1.0}) return loss_val, accuracy_val loss_summary = tf.summary.scalar('loss', loss) accuracy_summary = tf.summary.scalar('accuracy', accuracy) # 将所有的变量都集合起来 merged_summary = tf.summary.merge_all() # 用于test测试的summary merged_summary_test = tf.summary.merge([loss_summary, accuracy_summary]) LOG_DIR = '.' run_label = 'run_Bi-GRU_Dropout_tensorboard' run_dir = os.path.join(LOG_DIR, run_label) if not os.path.exists(run_dir): os.makedirs(run_dir) train_log_dir = os.path.join(run_dir, timestamp, 'train') test_los_dir = os.path.join(run_dir, timestamp, 'test') if not os.path.exists(train_log_dir): os.makedirs(train_log_dir) if not os.path.join(test_los_dir): os.makedirs(test_los_dir) # saver得到的文件句柄,可以将文件训练的快照保存到文件夹中去 saver = tf.train.Saver(tf.global_variables(), max_to_keep = 5) # train 代码 init_op = tf.global_variables_initializer() train_keep_prob_value = 0.2 test_keep_prob_value = 1.0 # 由于如果按照每一步都去计算的话,会很慢,所以我们规定每100次存储一次 output_summary_every_steps = 100 num_train_steps = 1000 # 每隔多少次保存一次 output_model_every_steps = 500 # 测试集测试 test_model_all_steps = 4000 i = 0 session_conf = tf.ConfigProto( gpu_options = tf.GPUOptions(allow_growth=True), allow_soft_placement = True, log_device_placement = False) with tf.Session(config = session_conf) as sess: sess.run(init_op) # 将训练过程中,将loss,accuracy写入文件里,后面是目录和计算图,如果想要在tensorboard中显示计算图,就想sess.graph加上 train_writer = tf.summary.FileWriter(train_log_dir, sess.graph) # 同样将测试的结果保存到tensorboard中,没有计算图 test_writer = tf.summary.FileWriter(test_los_dir) batches = batch_iter(list(zip(x_train, y_train)), hps.batch_size, hps.num_epochs) for batch in batches: train_x, train_y = zip(*batch) eval_ops = [loss, accuracy, train_op, global_step] should_out_summary = ((i + 1) % output_summary_every_steps == 0) if should_out_summary: eval_ops.append(merged_summary) # 那三个占位符输进去 # 计算loss, accuracy, train_op, global_step的图 eval_ops.append(merged_summary) outputs_train = sess.run(eval_ops, feed_dict={ inputs: train_x, outputs: train_y, dropout_keep_prob: train_keep_prob_value, is_training: hps.train_is_training }) loss_train, accuracy_train = outputs_train[0:2] if should_out_summary: # 由于我们想在100steps之后计算summary,所以上面 should_out_summary = ((i + 1) % output_summary_every_steps == 0)成立, # 即为真True,那么我们将训练的内容放入eval_ops的最后面了,因此,我们想获得summary的结果得在eval_ops_results的最后一个 train_summary_str = outputs_train[-1] # 将获得的结果写训练tensorboard文件夹中,由于训练从0开始,所以这里加上1,表示第几步的训练 train_writer.add_summary(train_summary_str, i + 1) test_summary_str = sess.run([merged_summary_test], feed_dict = {inputs: x_dev, outputs: y_dev, dropout_keep_prob: 1.0, is_training: hps.val_is_training })[0] test_writer.add_summary(test_summary_str, i + 1) current_step = tf.train.global_step(sess, global_step) if (i + 1) % 100 == 0: print("Step: %5d, loss: %3.3f, accuracy: %3.3f" % (i + 1, loss_train, accuracy_train)) # 500个batch校验一次 if (i + 1) % 500 == 0: loss_eval, accuracy_eval = val_steps(sess, x_dev, y_dev) print("Step: %5d, val_loss: %3.3f, val_accuracy: %3.3f" % (i + 1, loss_eval, accuracy_eval)) if (i + 1) % output_model_every_steps == 0: path = saver.save(sess,os.path.join(out_dir, 'ckp-%05d' % (i + 1))) print("Saved model checkpoint to {}\n".format(path)) print('model saved to ckp-%05d' % (i + 1)) if (i + 1) % test_model_all_steps == 0: # test_loss, test_acc, all_predictions= sess.run([loss, accuracy, y_pred], feed_dict = {inputs: x_test, outputs: y_test, dropout_keep_prob: 1.0}) test_loss, test_acc, all_predictions= sess.run([loss, accuracy, y_pred], feed_dict = {inputs: x_test, outputs: y_test, is_training: hps.val_is_training, dropout_keep_prob: 1.0}) print("test_loss: %3.3f, test_acc: %3.3d" % (test_loss, test_acc)) batches = batch_iter(list(x_test), 128, 1, shuffle=False) # Collect the predictions here all_predictions = [] for x_test_batch in batches: batch_predictions = sess.run(y_pred, {inputs: x_test_batch, is_training: hps.val_is_training, dropout_keep_prob: 1.0}) all_predictions = np.concatenate([all_predictions, batch_predictions]) correct_predictions = float(sum(all_predictions == y.flatten())) print("Total number of test examples: {}".format(len(y_test))) print("Accuracy: {:g}".format(correct_predictions/float(len(y_test)))) test_y = y_test.argmax(axis = 1) #生成混淆矩阵 conf_mat = confusion_matrix(test_y, all_predictions) fig, ax = plt.subplots(figsize = (4,2)) sns.heatmap(conf_mat, annot=True, fmt = 'd', xticklabels = cat_id_df.category_id.values, yticklabels = cat_id_df.category_id.values) font_set = FontProperties(fname = r"/usr/share/fonts/truetype/wqy/wqy-microhei.ttc", size=15) plt.ylabel(u'实际结果',fontsize = 18,fontproperties = font_set) plt.xlabel(u'预测结果',fontsize = 18,fontproperties = font_set) plt.savefig('./test.png') print('accuracy %s' % accuracy_score(all_predictions, test_y)) print(classification_report(test_y, all_predictions,target_names = cat_id_df['category_name'].values)) print(classification_report(test_y, all_predictions)) i += 1 ``` 以上的模型代码,请求各位大神帮我看看,为什么出现这样的结果?

tensorflow怎么获取最后一个维度特定索引的张量?

我有一个TensorFlow张量[1,513,513,21],其中的513,513分别表示模型预测的特征图的宽度和高度,21表示每个像素预测的21个类别的置信度,之前用tf.argmax返回的是最后一个维度,也就是21这个维度上最大那个置信度的值,给到每个像素点对应。可我现在想要输出每一个类别的,比如说每个像素对应给出0类别识别那个值,这样就有21个特征图出来,而不是一张。 ``` predictions[output] = tf.argmax(logits, 3) ```

TensorFlow训练过程中几步训练损失和准确率就一直不变了

![图片说明](https://img-ask.csdn.net/upload/202003/16/1584323271_696485.png) 从这步开始训练就一直不变了,很奇怪 train部分代码: ```# 导入文件 import os import numpy as np import tensorflow as tf import input_data import model # 变量声明 N_CLASSES = 4 # 四种花类型 IMG_W = 64 # resize图像,太大的话训练时间久 IMG_H = 64 BATCH_SIZE = 20 CAPACITY = 200 MAX_STEP = 10000 # 一般大于10K learning_rate = 0.00005 # 一般小于0.0001 # 获取批次batch train_dir = 'D:/python/four_flower-master 5/input_data' # 训练样本的读入路径 logs_train_dir = 'D:/python/four_flower-master 5/save' # logs存储路径 # train, train_label = input_data.get_files(train_dir) train, train_label, val, val_label = input_data.get_files(train_dir, 0.3) # 训练数据及标签 train_batch, train_label_batch = input_data.get_batch(train, train_label, IMG_W, IMG_H, BATCH_SIZE, CAPACITY) # 测试数据及标签 val_batch, val_label_batch = input_data.get_batch(val, val_label, IMG_W, IMG_H, BATCH_SIZE, CAPACITY) # 训练操作定义 train_logits = model.inference(train_batch, BATCH_SIZE, N_CLASSES) train_loss = model.losses(train_logits, train_label_batch) train_op = model.trainning(train_loss, learning_rate) train_acc = model.evaluation(train_logits, train_label_batch) # 测试操作定义 test_logits = model.inference(val_batch, BATCH_SIZE, N_CLASSES) test_loss = model.losses(test_logits, val_label_batch) test_acc = model.evaluation(test_logits, val_label_batch) # 这个是log汇总记录 summary_op = tf.summary.merge_all() # 产生一个会话 sess = tf.Session() # 产生一个writer来写log文件 train_writer = tf.summary.FileWriter(logs_train_dir, sess.graph) # val_writer = tf.summary.FileWriter(logs_test_dir, sess.graph) # 产生一个saver来存储训练好的模型 saver = tf.train.Saver() # 所有节点初始化 sess.run(tf.global_variables_initializer()) # 队列监控 coord = tf.train.Coordinator() threads = tf.train.start_queue_runners(sess=sess, coord=coord) # 进行batch的训练 try: # 执行MAX_STEP步的训练,一步一个batch for step in np.arange(MAX_STEP): if coord.should_stop(): break _, tra_loss, tra_acc = sess.run([train_op, train_loss, train_acc]) # 每隔50步打印一次当前的loss以及acc,同时记录log,写入writer if step % 10 == 0: print('Step %d, train loss = %.2f, train accuracy = %.2f%%' % (step, tra_loss, tra_acc * 100.0)) summary_str = sess.run(summary_op) train_writer.add_summary(summary_str, step) # 每隔100步,保存一次训练好的模型 if (step + 1) == MAX_STEP: checkpoint_path = os.path.join(logs_train_dir, 'model.ckpt') saver.save(sess, checkpoint_path, global_step=step) except tf.errors.OutOfRangeError: print('Done training -- epoch limit reached') finally: coord.request_stop() ``` ``` ``` ``` ``` ``` ```

tensorflow.python.framework.errors_impl.InvalidArgumentError

tensorflow.python.framework.errors_impl.InvalidArgumentError: Key: image/encoded. Can't parse serialized Example. [[Node: ParseSingleExample/ParseSingleExample = ParseSingleExample[Tdense=[DT_STRING, DT_INT64], dense_keys=["image/encoded", "image/label"], dense_shapes=[[8], []], num_sparse=0, sparse_keys=[], sparse_types=[]](arg0, ParseSingleExample/Const, ParseSingleExample/Const_1)]] [[Node: IteratorGetNext = IteratorGetNext[output_shapes=[[?,8,?,?,3], [?]], output_types=[DT_FLOAT, DT_INT64], _device="/job:localhost/replica:0/task:0/device:CPU:0"](Iterator)]] 请问这是什么原因啊?是我的数据集不对吗?

关于Tensorflow的DNN分类器

用Tensorflow写了一个简易的DNN网络(输入,一个隐层,输出),用作分类,数据集选用的是UCI 的iris数据集 激活函数使用softmax loss函数使用对数似然 以便最后的结果是一个概率解,选概率最大的分类的结果 目前的问题是预测结果出现问题,用测试数据测试显示结果如下 ![图片说明](https://img-ask.csdn.net/upload/201811/27/1543322274_512329.png) 刚刚入门...希望大家指点一下,谢谢辣! ``` #coding:utf-8 import matplotlib.pyplot as plt import tensorflow as tf import numpy as np import pandas as pd from sklearn.model_selection import train_test_split from sklearn.decomposition import PCA from sklearn import preprocessing from sklearn.model_selection import cross_val_score BATCH_SIZE = 30 iris = pd.read_csv('F:\dataset\iris\dataset.data', sep=',', header=None) ''' # 查看导入的数据 print("Dataset Lenght:: ", len(iris)) print("Dataset Shape:: ", iris.shape) print("Dataset:: ") print(iris.head(150)) ''' #将每条数据划分为样本值和标签值 X = iris.values[:, 0:4] Y = iris.values[:, 4] # 整理一下标签数据 # Iris-setosa ---> 0 # Iris-versicolor ---> 1 # Iris-virginica ---> 2 for i in range(len(Y)): if Y[i] == 'Iris-setosa': Y[i] = 0 elif Y[i] == 'Iris-versicolor': Y[i] = 1 elif Y[i] == 'Iris-virginica': Y[i] = 2 # 划分训练集与测试集 X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=10) #对数据集X与Y进行shape整理,让第一个参数为-1表示整理X成n行2列,整理Y成n行1列 X_train = np.vstack(X_train).reshape(-1, 4) Y_train = np.vstack(Y_train).reshape(-1, 1) X_test = np.vstack(X_test).reshape(-1, 4) Y_test = np.vstack(Y_test).reshape(-1, 1) ''' print(X_train) print(Y_train) print(X_test) print(Y_test) ''' #定义神经网络的输入,参数和输出,定义前向传播过程 def get_weight(shape): w = tf.Variable(tf.random_normal(shape), dtype=tf.float32) return w def get_bias(shape): b = tf.Variable(tf.constant(0.01, shape=shape)) return b x = tf.placeholder(tf.float32, shape=(None, 4)) yi = tf.placeholder(tf.float32, shape=(None, 1)) def BP_Model(): w1 = get_weight([4, 10]) # 第一个隐藏层,10个神经元,4个输入 b1 = get_bias([10]) y1 = tf.nn.softmax(tf.matmul(x, w1) + b1) # 注意维度 w2 = get_weight([10, 3]) # 输出层,3个神经元,10个输入 b2 = get_bias([3]) y = tf.nn.softmax(tf.matmul(y1, w2) + b2) return y def train(): # 生成计算图 y = BP_Model() # 定义损失函数 ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.arg_max(yi, 1)) loss_cem = tf.reduce_mean(ce) # 定义反向传播方法,正则化 train_step = tf.train.AdamOptimizer(0.001).minimize(loss_cem) # 定义保存器 saver = tf.train.Saver(tf.global_variables()) #生成会话 with tf.Session() as sess: init_op = tf.global_variables_initializer() sess.run(init_op) Steps = 5000 for i in range(Steps): start = (i * BATCH_SIZE) % 300 end = start + BATCH_SIZE reslut = sess.run(train_step, feed_dict={x: X_train[start:end], yi: Y_train[start:end]}) if i % 100 == 0: loss_val = sess.run(loss_cem, feed_dict={x: X_train, yi: Y_train}) print("step: ", i, "loss: ", loss_val) print("保存模型: ", saver.save(sess, './model_iris/bp_model.model')) tf.summary.FileWriter("logs/", sess.graph) #train() def prediction(): # 生成计算图 y = BP_Model() # 定义损失函数 ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=tf.arg_max(yi, 1)) loss_cem = tf.reduce_mean(ce) # 定义保存器 saver = tf.train.Saver(tf.global_variables()) with tf.Session() as sess: saver.restore(sess, './model_iris/bp_model.model') result = sess.run(y, feed_dict={x: X_test}) loss_val = sess.run(loss_cem, feed_dict={x: X_test, yi: Y_test}) print("result :", result) print("loss :", loss_val) result_set = sess.run(tf.argmax(result, axis=1)) print("predict result: ", result_set) print("real result: ", Y_test.reshape(1, -1)) #prediction() ```

minst深度学习例程不收敛,成功率始终在十几

minst深度学习程序不收敛 是关于tensorflow的问题。我是tensorflow的初学者。从书上抄了minst的学习程序。但是运行之后,无论学习了多少批次,成功率基本不变。 我做了许多尝试,去掉了正则化,去掉了滑动平均,还是不行。把batch_size改成了2,观察变量运算情况,输入x是正确的,但神经网络的输出y很多情况下在x不一样的情况下y的两个结果是完全一样的。进而softmax的结果也是一样的。百思不得其解,找不到造成这种情况的原因。这里把代码和运行情况都贴出来,请大神帮我找找原因。大过年的,祝大家春节快乐万事如意。 补充一下,进一步的测试表明,不是不能完成训练,而是要到700000轮以上,且最高达到65%左右就不能提高了。仔细看每一步的参数,是regularization值过大10e15以上,一点点减少,前面的训练都在训练它了。这东西我不是很明白。 ``` import struct import numpy as np import matplotlib.pyplot as plt from matplotlib.widgets import Slider, Button import tensorflow as tf import time #把MNIST的操作封装在一个类中,以后用起来方便。 class MyMinst(): def decode_idx3_ubyte(self,idx3_ubyte_file): with open(idx3_ubyte_file, 'rb') as f: print('解析文件:', idx3_ubyte_file) fb_data = f.read() offset = 0 fmt_header = '>iiii' # 以大端法读取4个 unsinged int32 magic_number, num_images, num_rows, num_cols = struct.unpack_from(fmt_header, fb_data, offset) print('idex3 魔数:{},图片数:{}'.format(magic_number, num_images)) offset += struct.calcsize(fmt_header) fmt_image = '>' + str(num_rows * num_cols) + 'B' images = np.empty((num_images, num_rows*num_cols)) #做了修改 for i in range(num_images): im = struct.unpack_from(fmt_image, fb_data, offset) images[i] = np.array(im)#这里用一维数组表示图片,np.array(im).reshape((num_rows, num_cols)) offset += struct.calcsize(fmt_image) return images def decode_idx1_ubyte(self,idx1_ubyte_file): with open(idx1_ubyte_file, 'rb') as f: print('解析文件:', idx1_ubyte_file) fb_data = f.read() offset = 0 fmt_header = '>ii' # 以大端法读取两个 unsinged int32 magic_number, label_num = struct.unpack_from(fmt_header, fb_data, offset) print('idex1 魔数:{},标签数:{}'.format(magic_number, label_num)) offset += struct.calcsize(fmt_header) labels = np.empty(shape=[0,10],dtype=float) #神经网络需要把label变成10位float的数组 fmt_label = '>B' # 每次读取一个 byte for i in range(label_num): n=struct.unpack_from(fmt_label, fb_data, offset) labels=np.append(labels,[[0,0,0,0,0,0,0,0,0,0]],axis=0) labels[i][n]=1 offset += struct.calcsize(fmt_label) return labels def __init__(self): #固定的训练文件位置 self.img=self.decode_idx3_ubyte("/home/zhangyl/Downloads/mnist/train-images.idx3-ubyte") self.result=self.decode_idx1_ubyte("/home/zhangyl/Downloads/mnist/train-labels.idx1-ubyte") print(self.result[0]) print(self.result[1000]) print(self.result[25000]) #固定的验证文件位置 self.validate_img=self.decode_idx3_ubyte("/home/zhangyl/Downloads/mnist/t10k-images.idx3-ubyte") self.validate_result=self.decode_idx1_ubyte("/home/zhangyl/Downloads/mnist/t10k-labels.idx1-ubyte") #每一批读训练数据的起始位置 self.train_read_addr=0 #每一批读训练数据的batchsize self.train_batchsize=100 #每一批读验证数据的起始位置 self.validate_read_addr=0 #每一批读验证数据的batchsize self.validate_batchsize=100 #定义用于返回batch数据的变量 self.train_img_batch=self.img self.train_result_batch=self.result self.validate_img_batch=self.validate_img self.validate_result_batch=self.validate_result def get_next_batch_traindata(self): n=len(self.img) #对参数范围适当约束 if self.train_read_addr+self.train_batchsize<=n : self.train_img_batch=self.img[self.train_read_addr:self.train_read_addr+self.train_batchsize] self.train_result_batch=self.result[self.train_read_addr:self.train_read_addr+self.train_batchsize] self.train_read_addr+=self.train_batchsize #改变起始位置 if self.train_read_addr==n : self.train_read_addr=0 else: self.train_img_batch=self.img[self.train_read_addr:n] self.train_img_batch.append(self.img[0:self.train_read_addr+self.train_batchsize-n]) self.train_result_batch=self.result[self.train_read_addr:n] self.train_result_batch.append(self.result[0:self.train_read_addr+self.train_batchsize-n]) self.train_read_addr=self.train_read_addr+self.train_batchsize-n #改变起始位置,这里没考虑batchsize大于n的情形 return self.train_img_batch,self.train_result_batch #测试一下用临时变量返回是否可行 def set_train_read_addr(self,addr): self.train_read_addr=addr def set_train_batchsize(self,batchsize): self.train_batchsize=batchsize if batchsize <1 : self.train_batchsize=1 def set_validate_read_addr(self,addr): self.validate_read_addr=addr def set_validate_batchsize(self,batchsize): self.validate_batchsize=batchsize if batchsize<1 : self.validate_batchsize=1 myminst=MyMinst() #minst类的实例 batch_size=2 #设置每一轮训练的Batch大小 learning_rate=0.8 #初始学习率 learning_rate_decay=0.999 #学习率的衰减 max_steps=300000 #最大训练步数 #定义存储训练轮数的变量,在使用tensorflow训练神经网络时, #一般会将代表训练轮数的变量通过trainable参数设置为不可训练的 training_step = tf.Variable(0,trainable=False) #定义得到隐藏层和输出层的前向传播计算方式,激活函数使用relu() def hidden_layer(input_tensor,weights1,biases1,weights2,biases2,layer_name): layer1=tf.nn.relu(tf.matmul(input_tensor,weights1)+biases1) return tf.matmul(layer1,weights2)+biases2 x=tf.placeholder(tf.float32,[None,784],name="x-input") y_=tf.placeholder(tf.float32,[None,10],name="y-output") #生成隐藏层参数,其中weights包含784*500=39200个参数 weights1=tf.Variable(tf.truncated_normal([784,500],stddev=0.1)) biases1=tf.Variable(tf.constant(0.1,shape=[500])) #生成输出层参数,其中weights2包含500*10=5000个参数 weights2=tf.Variable(tf.truncated_normal([500,10],stddev=0.1)) biases2=tf.Variable(tf.constant(0.1,shape=[10])) #计算经过神经网络前后向传播后得到的y值 y=hidden_layer(x,weights1,biases1,weights2,biases2,'y') #初始化一个滑动平均类,衰减率为0.99 #为了使模型在训练前期可以更新的更快,这里提供了num_updates参数,并设置为当前网络的训练轮数 #averages_class=tf.train.ExponentialMovingAverage(0.99,training_step) #定义一个更新变量滑动平均值的操作需要向滑动平均类的apply()函数提供一个参数列表 #train_variables()函数返回集合图上Graph.TRAINABLE_VARIABLES中的元素。 #这个集合的元素就是所有没有指定trainable_variables=False的参数 #averages_op=averages_class.apply(tf.trainable_variables()) #再次计算经过神经网络前向传播后得到的y值,这里使用了滑动平均,但要牢记滑动平均值只是一个影子变量 #average_y=hidden_layer(x,averages_class.average(weights1), # averages_class.average(biases1), # averages_class.average(weights2), # averages_class.average(biases2), # 'average_y') #softmax,计算交叉熵损失,L2正则,随机梯度优化器,学习率采用指数衰减 #函数原型为sparse_softmax_cross_entropy_with_logits(_sential,labels,logdits,name) #与softmax_cross_entropy_with_logits()函数的计算方式相同,更适用于每个类别相互独立且排斥 #的情况,即每一幅图只能属于一类 #在1.0.0版本的TensorFlow中,这个函数只能通过命名参数的方式来使用,在这里logits参数是神经网 #络不包括softmax层的前向传播结果,lables参数给出了训练数据的正确答案 softmax=tf.nn.softmax(y) cross_entropy=tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y+1e-10,labels=tf.argmax(y_,1)) #argmax()函数原型为argmax(input,axis,name,dimension)用于计算每一个样例的预测答案,其中 # input参数y是一个batch_size*10(batch_size行,10列)的二维数组。每一行表示一个样例前向传 # 播的结果,axis参数“1”表示选取最大值的操作只在第一个维度进行。即只在每一行选取最大值对应的下标 # 于是得到的结果是一个长度为batch_size的一维数组,这个一维数组的值就表示了每一个样例的数字识别 # 结果。 regularizer=tf.contrib.layers.l2_regularizer(0.0001) #计算L2正则化损失函数 regularization=regularizer(weights1)+regularizer(weights2) #计算模型的正则化损失 loss=tf.reduce_mean(cross_entropy)#+regularization #总损失 #用指数衰减法设置学习率,这里staircase参数采用默认的False,即学习率连续衰减 learning_rate=tf.train.exponential_decay(learning_rate,training_step, batch_size,learning_rate_decay) #使用GradientDescentOptimizer优化算法来优化交叉熵损失和正则化损失 train_op=tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=training_step) #在训练这个模型时,每过一遍数据既需要通过反向传播来更新神经网络中的参数,又需要 # 更新每一个参数的滑动平均值。control_dependencies()用于这样的一次性多次操作 #同样的操作也可以使用下面这行代码完成: #train_op=tf.group(train_step,average_op) #with tf.control_dependencies([train_step,averages_op]): # train_op=tf.no_op(name="train") #检查使用了滑动平均模型的神经网络前向传播结果是否正确 #equal()函数原型为equal(x,y,name),用于判断两个张量的每一维是否相等。 #如果相等返回True,否则返回False crorent_predicition=tf.equal(tf.argmax(y,1),tf.argmax(y_,1)) #cast()函数的原型为cast(x,DstT,name),在这里用于将一个布尔型的数据转换为float32类型 #之后对得到的float32型数据求平均值,这个平均值就是模型在这一组数据上的正确率 accuracy=tf.reduce_mean(tf.cast(crorent_predicition,tf.float32)) #创建会话和开始训练过程 with tf.Session() as sess: #在稍早的版本中一般使用initialize_all_variables()函数初始化全部变量 tf.global_variables_initializer().run() #准备验证数据 validate_feed={x:myminst.validate_img,y_:myminst.validate_result} #准备测试数据 test_feed= {x:myminst.img,y_:myminst.result} for i in range(max_steps): if i%1000==0: #计算滑动平均模型在验证数据上的结果 #为了能得到百分数输出,需要将得到的validate_accuracy扩大100倍 validate_accuracy= sess.run(accuracy,feed_dict=validate_feed) print("After %d trainning steps,validation accuracy using average model is %g%%" %(i,validate_accuracy*100)) #产生这一轮使用一个batch的训练数据,并进行训练 #input_data.read_data_sets()函数生成的类提供了train.next_batch()函数 #通过设置函数的batch_size参数就可以从所有的训练数据中读取一个小部分作为一个训练batch myminst.set_train_batchsize(batch_size) xs,ys=myminst.get_next_batch_traindata() var_print=sess.run([x,y,y_,loss,train_op,softmax,cross_entropy,regularization,weights1],feed_dict={x:xs,y_:ys}) print("after ",i," trainning steps:") print("x=",var_print[0][0],var_print[0][1],"y=",var_print[1],"y_=",var_print[2],"loss=",var_print[3], "softmax=",var_print[5],"cross_entropy=",var_print[6],"regularization=",var_print[7],var_print[7]) time.sleep(0.5) #使用测试数据集检验神经网络训练之后的正确率 #为了能得到百分数输出,需要将得到的test_accuracy扩大100倍 test_accuracy=sess.run(accuracy,feed_dict=test_feed) print("After %d training steps,test accuracy using average model is %g%%"%(max_steps,test_accuracy*100)) 下面是运行情况的一部分: x= [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 8. 76. 202. 254. 255. 163. 37. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 13. 182. 253. 253. 253. 253. 253. 253. 23. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 15. 179. 253. 253. 212. 91. 218. 253. 253. 179. 109. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 105. 253. 253. 160. 35. 156. 253. 253. 253. 253. 250. 113. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 19. 212. 253. 253. 88. 121. 253. 233. 128. 91. 245. 253. 248. 114. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 104. 253. 253. 110. 2. 142. 253. 90. 0. 0. 26. 199. 253. 248. 63. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 173. 253. 253. 29. 0. 84. 228. 39. 0. 0. 0. 72. 251. 253. 215. 29. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 36. 253. 253. 203. 13. 0. 0. 0. 0. 0. 0. 0. 0. 82. 253. 253. 170. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 36. 253. 253. 164. 0. 0. 0. 0. 0. 0. 0. 0. 0. 11. 198. 253. 184. 6. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 36. 253. 253. 82. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 138. 253. 253. 35. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 128. 253. 253. 47. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 48. 253. 253. 35. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 154. 253. 253. 47. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 48. 253. 253. 35. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 102. 253. 253. 99. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 48. 253. 253. 35. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 36. 253. 253. 164. 0. 0. 0. 0. 0. 0. 0. 0. 0. 16. 208. 253. 211. 17. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 32. 244. 253. 175. 4. 0. 0. 0. 0. 0. 0. 0. 0. 44. 253. 253. 156. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 171. 253. 253. 29. 0. 0. 0. 0. 0. 0. 0. 30. 217. 253. 188. 19. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 171. 253. 253. 59. 0. 0. 0. 0. 0. 0. 60. 217. 253. 253. 70. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 78. 253. 253. 231. 48. 0. 0. 0. 26. 128. 249. 253. 244. 94. 15. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 8. 151. 253. 253. 234. 101. 121. 219. 229. 253. 253. 201. 80. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 38. 232. 253. 253. 253. 253. 253. 253. 253. 201. 66. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 232. 253. 253. 95. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 3. 86. 46. 0. 0. 0. 0. 0. 0. 91. 246. 252. 232. 57. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 103. 252. 187. 13. 0. 0. 0. 0. 22. 219. 252. 252. 175. 0. 0. 0. 0. 0. 0. 0. 0. 0. 10. 0. 0. 0. 0. 8. 181. 252. 246. 30. 0. 0. 0. 0. 65. 252. 237. 197. 64. 0. 0. 0. 0. 0. 0. 0. 0. 0. 87. 0. 0. 0. 13. 172. 252. 252. 104. 0. 0. 0. 0. 5. 184. 252. 67. 103. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 8. 172. 252. 248. 145. 14. 0. 0. 0. 0. 109. 252. 183. 137. 64. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 5. 224. 252. 248. 134. 0. 0. 0. 0. 0. 53. 238. 252. 245. 86. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 12. 174. 252. 223. 88. 0. 0. 0. 0. 0. 0. 209. 252. 252. 179. 9. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 11. 171. 252. 246. 61. 0. 0. 0. 0. 0. 0. 83. 241. 252. 211. 14. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 129. 252. 252. 249. 220. 220. 215. 111. 192. 220. 221. 243. 252. 252. 149. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 144. 253. 253. 253. 253. 253. 253. 253. 253. 253. 255. 253. 226. 153. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 44. 77. 77. 77. 77. 77. 77. 77. 77. 153. 253. 235. 32. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 74. 214. 240. 114. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 24. 221. 243. 57. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 8. 180. 252. 119. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 136. 252. 153. 7. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 3. 136. 251. 226. 34. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 123. 252. 246. 39. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 165. 252. 127. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 165. 175. 3. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] y= [[ 0.58273095 0.50121385 -0.74845004 0.35842288 -0.13741069 -0.5839622 0.2642774 0.5101677 -0.29416046 0.5471707 ] [ 0.58273095 0.50121385 -0.74845004 0.35842288 -0.13741069 -0.5839622 0.2642774 0.5101677 -0.29416046 0.5471707 ]] y_= [[1. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]] loss= 2.2801425 softmax= [[0.14659645 0.13512042 0.03872566 0.11714067 0.07134604 0.04564939 0.10661562 0.13633572 0.06099501 0.14147504] [0.14659645 0.13512042 0.03872566 0.11714067 0.07134604 0.04564939 0.10661562 0.13633572 0.06099501 0.14147504]] cross_entropy= [1.9200717 2.6402135] regularization= 50459690000000.0 50459690000000.0 after 45 trainning steps: x= [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 25. 214. 225. 90. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 7. 145. 212. 253. 253. 60. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 106. 253. 253. 246. 188. 23. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 45. 164. 254. 253. 223. 108. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 24. 236. 253. 252. 124. 28. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 100. 217. 253. 218. 116. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 158. 175. 225. 253. 92. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 24. 217. 241. 248. 114. 2. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 21. 201. 253. 253. 114. 3. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 107. 253. 253. 213. 19. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 170. 254. 254. 169. 0. 0. 0. 0. 0. 2. 13. 100. 133. 89. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 18. 210. 253. 253. 100. 0. 0. 0. 19. 76. 116. 253. 253. 253. 176. 4. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 41. 222. 253. 208. 18. 0. 0. 93. 209. 232. 217. 224. 253. 253. 241. 31. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 157. 253. 253. 229. 32. 0. 154. 250. 246. 36. 0. 49. 253. 253. 168. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 128. 253. 253. 253. 195. 125. 247. 166. 69. 0. 0. 37. 236. 253. 168. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 37. 253. 253. 253. 253. 253. 135. 32. 0. 7. 130. 73. 202. 253. 133. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 7. 185. 253. 253. 253. 253. 64. 0. 10. 210. 253. 253. 253. 153. 9. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 66. 253. 253. 253. 253. 238. 218. 221. 253. 253. 235. 156. 37. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 5. 111. 228. 253. 253. 253. 253. 254. 253. 168. 19. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 9. 110. 178. 253. 253. 249. 63. 5. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 121. 121. 240. 253. 218. 121. 121. 44. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 17. 107. 184. 240. 253. 252. 252. 252. 252. 252. 252. 219. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 75. 122. 230. 252. 252. 252. 253. 252. 252. 252. 252. 252. 252. 239. 56. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 77. 129. 213. 244. 252. 252. 252. 252. 252. 253. 252. 252. 209. 252. 252. 252. 225. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 240. 252. 252. 252. 252. 252. 252. 213. 185. 53. 53. 53. 89. 252. 252. 252. 120. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 240. 232. 198. 93. 164. 108. 66. 28. 0. 0. 0. 0. 81. 252. 252. 222. 24. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 76. 50. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 171. 252. 243. 108. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 144. 238. 252. 115. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 7. 70. 241. 248. 133. 28. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 121. 252. 252. 172. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 64. 255. 253. 209. 21. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 13. 246. 253. 207. 21. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 10. 172. 252. 209. 92. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 13. 168. 252. 252. 92. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 43. 208. 252. 241. 53. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 15. 166. 252. 204. 62. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 13. 166. 243. 191. 29. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 10. 168. 231. 177. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 6. 172. 241. 50. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 177. 202. 19. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] y= [[ 0.8592988 0.3954708 -0.77875614 0.26675048 0.19804694 -0.61968666 0.18084174 0.4034736 -0.34189415 0.43645462] [ 0.8592988 0.3954708 -0.77875614 0.26675048 0.19804694 -0.61968666 0.18084174 0.4034736 -0.34189415 0.43645462]] y_= [[0. 0. 0. 0. 0. 0. 1. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]] loss= 2.2191708 softmax= [[0.19166051 0.12052987 0.0372507 0.10597225 0.09893605 0.04367344 0.09724841 0.12149832 0.05765821 0.12557226] [0.19166051 0.12052987 0.0372507 0.10597225 0.09893605 0.04367344 0.09724841 0.12149832 0.05765821 0.12557226]] cross_entropy= [2.3304868 2.1078548] regularization= 50459690000000.0 50459690000000.0 after 46 trainning steps: x= [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 196. 99. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 5. 49. 0. 0. 0. 0. 0. 0. 34. 244. 98. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 89. 135. 0. 0. 0. 0. 0. 0. 40. 253. 98. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 171. 150. 0. 0. 0. 0. 0. 0. 40. 253. 98. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 254. 233. 0. 0. 0. 0. 0. 0. 77. 253. 98. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 255. 136. 0. 0. 0. 0. 0. 0. 77. 254. 99. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 254. 135. 0. 0. 0. 0. 0. 0. 123. 253. 98. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 254. 135. 0. 0. 0. 0. 0. 0. 136. 253. 98. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 16. 254. 135. 0. 0. 0. 0. 0. 0. 136. 237. 8. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 98. 254. 135. 0. 0. 38. 99. 98. 98. 219. 155. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 196. 255. 208. 186. 254. 254. 255. 254. 254. 254. 254. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 105. 254. 253. 239. 180. 135. 39. 39. 39. 237. 170. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 137. 92. 24. 0. 0. 0. 0. 0. 234. 155. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 13. 237. 155. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 79. 253. 155. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 31. 242. 155. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 61. 248. 155. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 234. 155. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 234. 155. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 196. 155. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 50. 236. 255. 124. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 53. 231. 253. 253. 107. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 9. 193. 253. 253. 230. 4. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 7. 156. 253. 253. 149. 36. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 24. 253. 253. 190. 8. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 3. 175. 253. 253. 72. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 123. 253. 253. 138. 3. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 10. 244. 253. 230. 34. 0. 9. 24. 23. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 181. 253. 249. 123. 0. 69. 195. 253. 249. 146. 15. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 21. 231. 253. 202. 0. 70. 236. 253. 253. 253. 253. 170. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 22. 139. 253. 213. 26. 13. 200. 253. 253. 183. 252. 253. 220. 22. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 72. 253. 253. 129. 0. 86. 253. 253. 129. 4. 105. 253. 253. 70. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 72. 253. 253. 77. 22. 245. 253. 183. 4. 0. 2. 105. 253. 70. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 132. 253. 253. 11. 24. 253. 253. 116. 0. 0. 1. 150. 253. 70. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 189. 253. 241. 10. 24. 253. 253. 59. 0. 0. 82. 253. 212. 30. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 189. 253. 147. 0. 24. 253. 253. 150. 30. 44. 208. 212. 31. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 189. 253. 174. 3. 7. 185. 253. 253. 227. 247. 184. 30. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 150. 253. 253. 145. 95. 234. 253. 253. 253. 126. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 72. 253. 253. 253. 253. 253. 253. 253. 169. 14. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 5. 114. 240. 253. 253. 234. 135. 44. 3. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] y= [[ 0.7093834 0.30119324 -0.80789334 0.1838598 0.12065991 -0.6538477 0.49587095 0.6995347 -0.38699397 0.33823296] [ 0.7093834 0.30119324 -0.80789334 0.1838598 0.12065991 -0.6538477 0.49587095 0.6995347 -0.38699397 0.33823296]] y_= [[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.] [0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]] loss= 2.2107558 softmax= [[0.16371341 0.10884525 0.03590371 0.09679484 0.09086671 0.04188326 0.1322382 0.16210894 0.05469323 0.11295244] [0.16371341 0.10884525 0.03590371 0.09679484 0.09086671 0.04188326 0.1322382 0.16210894 0.05469323 0.11295244]] cross_entropy= [2.3983614 2.0231504] regularization= 50459690000000.0 50459690000000.0 after 47 trainning steps: x= [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 11. 139. 212. 253. 159. 86. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 34. 89. 203. 253. 252. 252. 252. 252. 74. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 49. 184. 234. 252. 252. 184. 110. 100. 208. 252. 199. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 95. 233. 252. 252. 176. 56. 0. 0. 0. 17. 234. 249. 75. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 220. 253. 178. 54. 4. 0. 0. 0. 0. 43. 240. 243. 50. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 221. 255. 180. 55. 5. 0. 0. 0. 7. 160. 253. 168. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 116. 253. 252. 252. 67. 0. 0. 0. 91. 252. 231. 42. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 32. 190. 252. 252. 185. 38. 0. 119. 234. 252. 54. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 15. 177. 252. 252. 179. 155. 236. 227. 119. 4. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 26. 221. 252. 252. 253. 252. 130. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 32. 229. 253. 255. 144. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 66. 236. 252. 253. 92. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 66. 234. 252. 252. 253. 92. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 19. 236. 252. 252. 252. 253. 92. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 53. 181. 252. 168. 43. 232. 253. 92. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 179. 255. 218. 32. 93. 253. 252. 84. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 81. 244. 239. 33. 0. 114. 252. 209. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 207. 252. 237. 70. 153. 240. 252. 32. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 207. 252. 253. 252. 252. 252. 210. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 61. 242. 253. 252. 168. 96. 12. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 68. 254. 255. 254. 107. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 11. 176. 230. 253. 253. 253. 212. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 28. 197. 253. 253. 253. 253. 253. 229. 107. 14. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 194. 253. 253. 253. 253. 253. 253. 253. 253. 53. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 69. 241. 253. 253. 253. 253. 241. 186. 253. 253. 195. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 10. 161. 253. 253. 253. 246. 40. 57. 231. 253. 253. 195. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 140. 253. 253. 253. 253. 154. 0. 25. 253. 253. 253. 195. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 213. 253. 253. 253. 135. 8. 0. 3. 128. 253. 253. 195. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 77. 238. 253. 253. 253. 7. 0. 0. 0. 116. 253. 253. 195. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 11. 165. 253. 253. 231. 70. 1. 0. 0. 0. 78. 237. 253. 195. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 33. 253. 253. 253. 182. 0. 0. 0. 0. 0. 0. 200. 253. 195. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 98. 253. 253. 253. 24. 0. 0. 0. 0. 0. 0. 42. 253. 195. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 197. 253. 253. 253. 24. 0. 0. 0. 0. 0. 0. 163. 253. 195. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 197. 253. 253. 189. 13. 0. 0. 0. 0. 0. 53. 227. 253. 121. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 197. 253. 253. 114. 0. 0. 0. 0. 0. 21. 227. 253. 231. 27. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 197. 253. 253. 114. 0. 0. 0. 5. 131. 143. 253. 231. 59. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 197. 253. 253. 236. 73. 58. 217. 223. 253. 253. 253. 174. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 197. 253. 253. 253. 253. 253. 253. 253. 253. 253. 253. 48. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 149. 253. 253. 253. 253. 253. 253. 253. 253. 182. 15. 3. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 12. 168. 253. 253. 253. 253. 253. 248. 89. 23. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] y= [[ 0.5813921 0.21609789 -0.8359629 0.10818548 0.44052082 -0.6865921 0.78338754 0.5727978 -0.4297532 0.24992661] [ 0.5813921 0.21609789 -0.8359629 0.10818548 0.44052082 -0.6865921 0.78338754 0.5727978 -0.4297532 0.24992661]] y_= [[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.] [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]] loss= 2.452383 softmax= [[0.14272858 0.09905256 0.03459087 0.08892009 0.1239742 0.04016358 0.1746773 0.14150718 0.05192496 0.10246069] [0.14272858 0.09905256 0.03459087 0.08892009 0.1239742 0.04016358 0.1746773 0.14150718 0.05192496 0.10246069]] cross_entropy= [2.9579558 1.9468105] regularization= 50459690000000.0 50459690000000.0 已终止 ```

用tensorflow做机器翻译时训练代码有问题

``` # -*- coding:UTF-8 -*- import tensorflow as tf src_path = 'D:/Python37/untitled1/train.tags.en-zh.en.deletehtml' trg_path = 'D:/Python37/untitled1/train.tags.en-zh.zh.deletehtml' SRC_TRAIN_DATA = 'D:/Python37/untitled1/train.tags.en-zh.en.deletehtml.segment' # 源语言输入文件 TRG_TRAIN_DATA = 'D:/Python37/untitled1/train.tags.en-zh.zh.deletehtml.segment' # 目标语言输入文件 CHECKPOINT_PATH = './model/seq2seq_ckpt' # checkpoint保存路径 HIDDEN_SIZE = 1024 # LSTM的隐藏层规模 NUM_LAYERS = 2 # 深层循环神经网络中LSTM结构的层数 SRC_VOCAB_SIZE = 10000 # 源语言词汇表大小 TRG_VOCAB_SIZE = 4000 # 目标语言词汇表大小 BATCH_SIZE = 100 # 训练数据batch的大小 NUM_EPOCH = 5 # 使用训练数据的轮数 KEEP_PROB = 0.8 # 节点不被dropout的概率 MAX_GRAD_NORM = 5 # 用于控制梯度膨胀的梯度大小上限 SHARE_EMB_AND_SOFTMAX = True # 在softmax层和词向量层之间共享参数 MAX_LEN = 50 # 限定句子的最大单词数量 SOS_ID = 1 # 目标语言词汇表中<sos>的ID """ function: 数据batching,产生最后输入数据格式 Parameters: file_path-数据路径 Returns: dataset- 每个句子-对应的长度组成的TextLineDataset类的数据集对应的张量 """ def MakeDataset(file_path): dataset = tf.data.TextLineDataset(file_path) # map(function, sequence[, sequence, ...]) -> list # 通过定义可以看到,这个函数的第一个参数是一个函数,剩下的参数是一个或多个序列,返回值是一个集合。 # function可以理解为是一个一对一或多对一函数,map的作用是以参数序列中的每一个元素调用function函数,返回包含每次function函数返回值的list。 # lambda argument_list: expression # 其中lambda是Python预留的关键字,argument_list和expression由用户自定义 # argument_list参数列表, expression 为函数表达式 # 根据空格将单词编号切分开并放入一个一维向量 dataset = dataset.map(lambda string: tf.string_split([string]).values) # 将字符串形式的单词编号转化为整数 dataset = dataset.map(lambda string: tf.string_to_number(string, tf.int32)) # 统计每个句子的单词数量,并与句子内容一起放入Dataset dataset = dataset.map(lambda x: (x, tf.size(x))) return dataset """ function: 从源语言文件src_path和目标语言文件trg_path中分别读取数据,并进行填充和batching操作 Parameters: src_path-源语言,即被翻译的语言,英语. trg_path-目标语言,翻译之后的语言,汉语. batch_size-batch的大小 Returns: dataset- 每个句子-对应的长度 组成的TextLineDataset类的数据集 """ def MakeSrcTrgDataset(src_path, trg_path, batch_size): # 首先分别读取源语言数据和目标语言数据 src_data = MakeDataset(src_path) trg_data = MakeDataset(trg_path) # 通过zip操作将两个Dataset合并为一个Dataset,现在每个Dataset中每一项数据ds由4个张量组成 # ds[0][0]是源句子 # ds[0][1]是源句子长度 # ds[1][0]是目标句子 # ds[1][1]是目标句子长度 #https://blog.csdn.net/qq_32458499/article/details/78856530这篇博客看一下可以细致了解一下Dataset这个库,以及.map和.zip的用法 dataset = tf.data.Dataset.zip((src_data, trg_data)) # 删除内容为空(只包含<eos>)的句子和长度过长的句子 def FilterLength(src_tuple, trg_tuple): ((src_input, src_len), (trg_label, trg_len)) = (src_tuple, trg_tuple) # tf.logical_and 相当于集合中的and做法,后面两个都为true最终结果才会为true,否则为false # tf.greater Returns the truth value of (x > y),所以以下所说的是句子长度必须得大于一也就是不能为空的句子 # tf.less_equal Returns the truth value of (x <= y),所以所说的是长度要小于最长长度 src_len_ok = tf.logical_and(tf.greater(src_len, 1), tf.less_equal(src_len, MAX_LEN)) trg_len_ok = tf.logical_and(tf.greater(trg_len, 1), tf.less_equal(trg_len, MAX_LEN)) return tf.logical_and(src_len_ok, trg_len_ok) #两个都满足才返回true # filter接收一个函数Func并将该函数作用于dataset的每个元素,根据返回值True或False保留或丢弃该元素,True保留该元素,False丢弃该元素 # 最后得到的就是去掉空句子和过长的句子的数据集 dataset = dataset.filter(FilterLength) # 解码器需要两种格式的目标句子: # 1.解码器的输入(trg_input), 形式如同'<sos> X Y Z' # 2.解码器的目标输出(trg_label), 形式如同'X Y Z <eos>' # 上面从文件中读到的目标句子是'X Y Z <eos>'的形式,我们需要从中生成'<sos> X Y Z'形式并加入到Dataset # 编码器只有输入,没有输出,而解码器有输入也有输出,输入为<sos>+(除去最后一位eos的label列表) # 例如train.en最后都为2,id为2就是eos def MakeTrgInput(src_tuple, trg_tuple): ((src_input, src_len), (trg_label, trg_len)) = (src_tuple, trg_tuple) # tf.concat用法 https://blog.csdn.net/qq_33431368/article/details/79429295 trg_input = tf.concat([[SOS_ID], trg_label[:-1]], axis=0) return ((src_input, src_len), (trg_input, trg_label, trg_len)) dataset = dataset.map(MakeTrgInput) # 随机打乱训练数据 dataset = dataset.shuffle(10000) # 规定填充后的输出的数据维度 padded_shapes = ( (tf.TensorShape([None]), # 源句子是长度未知的向量 tf.TensorShape([])), # 源句子长度是单个数字 (tf.TensorShape([None]), # 目标句子(解码器输入)是长度未知的向量 tf.TensorShape([None]), # 目标句子(解码器目标输出)是长度未知的向量 tf.TensorShape([])) # 目标句子长度(输出)是单个数字 ) # 调用padded_batch方法进行padding 和 batching操作 batched_dataset = dataset.padded_batch(batch_size, padded_shapes) return batched_dataset """ function: seq2seq模型 Parameters: Returns: """ class NMTModel(object): """ function: 模型初始化 Parameters: Returns: """ def __init__(self): # 定义编码器和解码器所使用的LSTM结构 self.enc_cell = tf.nn.rnn_cell.MultiRNNCell( [tf.nn.rnn_cell.LSTMCell(HIDDEN_SIZE) for _ in range(NUM_LAYERS)]) self.dec_cell = tf.nn.rnn_cell.MultiRNNCell( [tf.nn.rnn_cell.LSTMCell(HIDDEN_SIZE) for _ in range(NUM_LAYERS)]) # 为源语言和目标语言分别定义词向量 self.src_embedding = tf.get_variable('src_emb', [SRC_VOCAB_SIZE, HIDDEN_SIZE]) self.trg_embedding = tf.get_variable('trg_emb', [TRG_VOCAB_SIZE, HIDDEN_SIZE]) # 定义softmax层的变量 if SHARE_EMB_AND_SOFTMAX: self.softmax_weight = tf.transpose(self.trg_embedding) else: self.softmax_weight = tf.get_variable('weight', [HIDDEN_SIZE, TRG_VOCAB_SIZE]) self.softmax_bias = tf.get_variable('softmax_loss', [TRG_VOCAB_SIZE]) """ function: 在forward函数中定义模型的前向计算图 Parameters:   MakeSrcTrgDataset函数产生的五种张量如下(全部为张量) src_input: 编码器输入(源数据) src_size : 输入大小 trg_input:解码器输入(目标数据) trg_label:解码器输出(目标数据) trg_size: 输出大小 Returns: """ def forward(self, src_input, src_size, trg_input, trg_label, trg_size): batch_size = tf.shape(src_input)[0] # 将输入和输出单词转为词向量(rnn中输入数据都要转换成词向量) # 相当于input中的每个id对应的embedding中的向量转换 src_emb = tf.nn.embedding_lookup(self.src_embedding, src_input) trg_emb = tf.nn.embedding_lookup(self.trg_embedding, trg_input) # 在词向量上进行dropout src_emb = tf.nn.dropout(src_emb, KEEP_PROB) trg_emb = tf.nn.dropout(trg_emb, KEEP_PROB) # 使用dynamic_rnn构造编码器 # 编码器读取源句子每个位置的词向量,输出最后一步的隐藏状态enc_state # 因为编码器是一个双层LSTM,因此enc_state是一个包含两个LSTMStateTuple类的tuple, # 每个LSTMStateTuple对应编码器中一层的状态 # enc_outputs是顶层LSTM在每一步的输出,它的维度是[batch_size, max_time, HIDDEN_SIZE] # seq2seq模型中不需要用到enc_outputs,而attention模型会用到它 with tf.variable_scope('encoder'): enc_outputs, enc_state = tf.nn.dynamic_rnn(self.enc_cell, src_emb, src_size, dtype=tf.float32) # 使用dynamic_rnn构造解码器 # 解码器读取目标句子每个位置的词向量,输出的dec_outputs为每一步顶层LSTM的输出 # dec_outputs的维度是[batch_size, max_time, HIDDEN_SIZE] # initial_state=enc_state表示用编码器的输出来初始化第一步的隐藏状态 # 编码器最后编码结束最后的状态为解码器初始化的状态 with tf.variable_scope('decoder'): dec_outputs, _ = tf.nn.dynamic_rnn(self.dec_cell, trg_emb, trg_size, initial_state=enc_state) # 计算解码器每一步的log perplexity # 输出重新转换成shape为[,HIDDEN_SIZE] output = tf.reshape(dec_outputs, [-1, HIDDEN_SIZE]) # 计算解码器每一步的softmax概率值 logits = tf.matmul(output, self.softmax_weight) + self.softmax_bias # 交叉熵损失函数,算loss loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=tf.reshape(trg_label, [-1]), logits=logits) # 在计算平均损失时,需要将填充位置的权重设置为0,以避免无效位置的预测干扰模型的训练 label_weights = tf.sequence_mask(trg_size, maxlen=tf.shape(trg_label)[1], dtype=tf.float32) label_weights = tf.reshape(label_weights, [-1]) cost = tf.reduce_sum(loss * label_weights) cost_per_token = cost / tf.reduce_sum(label_weights) # 定义反向传播操作 trainable_variables = tf.trainable_variables() # 控制梯度大小,定义优化方法和训练步骤 # 算出每个需要更新的值的梯度,并对其进行控制 grads = tf.gradients(cost / tf.to_float(batch_size), trainable_variables) grads, _ = tf.clip_by_global_norm(grads, MAX_GRAD_NORM) # 利用梯度下降优化算法进行优化.学习率为1.0 optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0) # 相当于minimize的第二步,正常来讲所得到的list[grads,vars]由compute_gradients得到,返回的是执行对应变量的更新梯度操作的op train_op = optimizer.apply_gradients(zip(grads, trainable_variables)) return cost_per_token, train_op """ function: 使用给定的模型model上训练一个epoch,并返回全局步数,每训练200步便保存一个checkpoint Parameters: session : 会议 cost_op : 计算loss的操作op train_op: 训练的操作op saver:  保存model的类 step:   训练步数 Returns: """ def run_epoch(session, cost_op, train_op, saver, step): # 训练一个epoch # 重复训练步骤直至遍历完Dataset中所有数据 while True: try: # 运行train_op并计算cost_op的结果也就是损失值,训练数据在main()函数中以Dataset方式提供 cost, _ = session.run([cost_op, train_op]) # 步数为10的倍数进行打印 if step % 10 == 0: print('After %d steps, per token cost is %.3f' % (step, cost)) # 每200步保存一个checkpoint if step % 200 == 0: saver.save(session, CHECKPOINT_PATH, global_step=step) step += 1 except tf.errors.OutOfRangeError: break return step """ function: 主函数 Parameters: Returns: """ def main(): # 定义初始化函数 initializer = tf.random_uniform_initializer(-0.05, 0.05) # 定义训练用的循环神经网络模型 with tf.variable_scope('nmt_model', reuse=None, initializer=initializer): train_model = NMTModel() # 定义输入数据 data = MakeSrcTrgDataset(SRC_TRAIN_DATA, TRG_TRAIN_DATA, BATCH_SIZE) iterator = data.make_initializable_iterator() (src, src_size), (trg_input, trg_label, trg_size) = iterator.get_next() # 定义前向计算图,输入数据以张量形式提供给forward函数 cost_op, train_op = train_model.forward(src, src_size, trg_input, trg_label, trg_size) # 训练模型 # 保存模型 saver = tf.train.Saver() step = 0 with tf.Session() as sess: # 初始化全部变量 tf.global_variables_initializer().run() # 进行NUM_EPOCH轮数 for i in range(NUM_EPOCH): print('In iteration: %d' % (i + 1)) sess.run(iterator.initializer) step = run_epoch(sess, cost_op, train_op, saver, step) if __name__ == '__main__': main() ``` 问题如下,不知道怎么解决,谢谢! Traceback (most recent call last): File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1334, in _do_call return fn(*args) File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1319, in _run_fn options, feed_dict, fetch_list, target_list, run_metadata) File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.InvalidArgumentError: StringToNumberOp could not correctly convert string: This [[{{node StringToNumber}}]] [[{{node IteratorGetNext}}]] During handling of the above exception, another exception occurred: Traceback (most recent call last): File "D:/Python37/untitled1/train_model.py", line 277, in <module> main() File "D:/Python37/untitled1/train_model.py", line 273, in main step = run_epoch(sess, cost_op, train_op, saver, step) File "D:/Python37/untitled1/train_model.py", line 231, in run_epoch cost, _ = session.run([cost_op, train_op]) File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 929, in run run_metadata_ptr) File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1152, in _run feed_dict_tensor, options, run_metadata) File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1328, in _do_run run_metadata) File "D:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1348, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InvalidArgumentError: StringToNumberOp could not correctly convert string: This [[{{node StringToNumber}}]] [[node IteratorGetNext (defined at D:/Python37/untitled1/train_model.py:259) ]]

Tensorflow 多GPU并行训练,模型收敛su'du'ma

在使用多GPU并行训练深度学习神经网络时,以TFrecords 形式读取MNIST数据的训练数据进行训练,发现比直接用MNIST训练数据训练相同模型时,发现前者收敛速度慢和运行时间长,已知模型没有问题,想请大神帮忙看看是什么原因导致运行速度慢,运行时间长 ``` import os import time import numpy as np import tensorflow as tf from datetime import datetime import tensorflow.compat.v1 as v1 from tensorflow.examples.tutorials.mnist import input_data BATCH_SIZE = 100 LEARNING_RATE = 1e-4 LEARNING_RATE_DECAY = 0.99 REGULARZTION_RATE = 1e-4 EPOCHS = 10000 MOVING_AVERAGE_DECAY = 0.99 N_GPU = 2 MODEL_SAVE_PATH = r'F:\model\log_dir' MODEL_NAME = 'model.ckpt' TRAIN_PATH = r'F:\model\threads_file\MNIST_data_tfrecords\train.tfrecords' TEST_PATH = r'F:\model\threads_file\MNIST_data_tfrecords\test.tfrecords' def __int64_feature(value): return v1.train.Feature(int64_list=v1.train.Int64List(value=[value])) def __bytes_feature(value): return v1.train.Feature(bytes_list=v1.train.BytesList(value=[value])) def creat_tfrecords(path, data, labels): writer = tf.io.TFRecordWriter(path) for i in range(len(data)): image = data[i].tostring() label = labels[i] examples = v1.train.Example(features=v1.train.Features(feature={ 'image': __bytes_feature(image), 'label': __int64_feature(label) })) writer.write(examples.SerializeToString()) writer.close() def parser(record): features = v1.parse_single_example(record, features={ 'image': v1.FixedLenFeature([], tf.string), 'label': v1.FixedLenFeature([], tf.int64) }) image = tf.decode_raw(features['image'], tf.uint8) image = tf.reshape(image, [28, 28, 1]) image = tf.cast(image, tf.float32) label = tf.cast(features['label'], tf.int32) label = tf.one_hot(label, 10, on_value=1, off_value=0) return image, label def get_input(batch_size, path): dataset = tf.data.TFRecordDataset([path]) dataset = dataset.map(parser) dataset = dataset.shuffle(10000) dataset = dataset.repeat(100) dataset = dataset.batch(batch_size) iterator = dataset.make_one_shot_iterator() image, label = iterator.get_next() return image, label def model_inference(images, labels, rate, regularzer=None, reuse_variables=None): with v1.variable_scope(v1.get_variable_scope(), reuse=reuse_variables): with tf.compat.v1.variable_scope('First_conv'): w1 = tf.compat.v1.get_variable('weights', [3, 3, 1, 32], tf.float32, initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1)) if regularzer: tf.add_to_collection('losses', regularzer(w1)) b1 = tf.compat.v1.get_variable('biases', [32], tf.float32, initializer=tf.compat.v1.constant_initializer(0.1)) activation1 = tf.nn.relu(tf.nn.conv2d(images, w1, strides=[1, 1, 1, 1], padding='SAME') + b1) out1 = tf.nn.max_pool2d(activation1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') with tf.compat.v1.variable_scope('Second_conv'): w2 = tf.compat.v1.get_variable('weight', [3, 3, 32, 64], tf.float32, initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1)) if regularzer: tf.add_to_collection('losses', regularzer(w2)) b2 = tf.compat.v1.get_variable('biases', [64], tf.float32, initializer=tf.compat.v1.constant_initializer(0.1)) activation2 = tf.nn.relu(tf.nn.conv2d(out1, w2, strides=[1, 1, 1, 1], padding='SAME') + b2) out2 = tf.nn.max_pool2d(activation2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME') out3 = tf.reshape(out2, [-1, 7*7*64], name='flatten') with tf.compat.v1.variable_scope('FC_1'): w3 = tf.compat.v1.get_variable('weight', [7*7*64, 1024], tf.float32, initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1)) if regularzer: tf.add_to_collection('losses', regularzer(w3)) b3 = tf.compat.v1.get_variable('biases', [1024], tf.float32, initializer=tf.compat.v1.constant_initializer(0.1)) activation3 = tf.nn.relu(tf.matmul(out3, w3) + b3) out4 = tf.nn.dropout(activation3, keep_prob=rate) with tf.compat.v1.variable_scope('FC_2'): w4 = tf.compat.v1.get_variable('weight', [1024, 10], tf.float32, initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1)) if regularzer: tf.add_to_collection('losses', regularzer(w4)) b4 = tf.compat.v1.get_variable('biases', [10], tf.float32, initializer=tf.compat.v1.constant_initializer(0.1)) output = tf.nn.softmax(tf.matmul(out4, w4) + b4) with tf.compat.v1.variable_scope('Loss_entropy'): if regularzer: loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=tf.argmax(labels, 1), logits=output)) \ + tf.add_n(tf.get_collection('losses')) else: loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=tf.argmax(labels, 1), logits=output)) with tf.compat.v1.variable_scope('Accuracy'): correct_data = tf.equal(tf.math.argmax(labels, 1), tf.math.argmax(output, 1)) accuracy = tf.reduce_mean(tf.cast(correct_data, tf.float32, name='accuracy')) return output, loss, accuracy def average_gradients(tower_grads): average_grads = [] for grad_and_vars in zip(*tower_grads): grads = [] for g, v2 in grad_and_vars: expanded_g = tf.expand_dims(g, 0) grads.append(expanded_g) grad = tf.concat(grads, 0) grad = tf.reduce_mean(grad, 0) v = grad_and_vars[0][1] grad_and_var = (grad, v) average_grads.append(grad_and_var) return average_grads def main(argv=None): with tf.Graph().as_default(), tf.device('/cpu:0'): x, y = get_input(batch_size=BATCH_SIZE, path=TRAIN_PATH) regularizer = tf.contrib.layers.l2_regularizer(REGULARZTION_RATE) global_step = v1.get_variable('global_step', [], initializer=v1.constant_initializer(0), trainable=False) lr = v1.train.exponential_decay(LEARNING_RATE, global_step, 55000/BATCH_SIZE, LEARNING_RATE_DECAY) opt = v1.train.AdamOptimizer(lr) tower_grads = [] reuse_variables = False device = ['/gpu:0', '/cpu:0'] for i in range(len(device)): with tf.device(device[i]): with v1.name_scope(device[i][1:4] + '_0') as scope: out, cur_loss, acc = model_inference(x, y, 0.3, regularizer, reuse_variables) reuse_variables = True grads = opt.compute_gradients(cur_loss) tower_grads.append(grads) grads = average_gradients(tower_grads) for grad, var in grads: if grad is not None: v1.summary.histogram('gradients_on_average/%s' % var.op.name, grad) apply_gradient_op = opt.apply_gradients(grads, global_step) for var in v1.trainable_variables(): tf.summary.histogram(var.op.name, var) variable_averages = v1.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step) variable_to_average = (v1.trainable_variables() + v1.moving_average_variables()) variable_averages_op = variable_averages.apply(variable_to_average) train_op = tf.group(apply_gradient_op, variable_averages_op) saver = v1.train.Saver(max_to_keep=1) summary_op = v1.summary.merge_all() # merge_all 可以将所有summary全部保存到磁盘 init = v1.global_variables_initializer() with v1.Session(config=v1.ConfigProto(allow_soft_placement=True, log_device_placement=True)) as sess: init.run() summary_writer = v1.summary.FileWriter(MODEL_SAVE_PATH, sess.graph) # 指定一个文件用来保存图 for step in range(EPOCHS): try: start_time = time.time() _, loss_value, out_value, acc_value = sess.run([train_op, cur_loss, out, acc]) duration = time.time() - start_time if step != 0 and step % 100 == 0: num_examples_per_step = BATCH_SIZE * N_GPU examples_per_sec = num_examples_per_step / duration sec_per_batch = duration / N_GPU format_str = '%s: step %d, loss = %.2f(%.1f examples/sec; %.3f sec/batch), accuracy = %.2f' print(format_str % (datetime.now(), step, loss_value, examples_per_sec, sec_per_batch, acc_value)) summary = sess.run(summary_op) summary_writer.add_summary(summary, step) if step % 100 == 0 or (step + 1) == EPOCHS: checkpoint_path = os.path.join(MODEL_SAVE_PATH, MODEL_NAME) saver.save(sess, checkpoint_path, global_step=step) except tf.errors.OutOfRangeError: break if __name__ == '__main__': tf.app.run() ```

使用tensorflow的API dataset遇到memoryerror

使用Tensorflow的API dataset的时候遇到了memoryerror,可是我是使用官方推荐的占位符的方法啊,我的系统是ubuntu 18.0.4,tensorflow 的版本是1.13.1,Python3.6,先上代码: ``` def main(_): if FLAGS.self_test: train_data, train_labels = fake_data(256) validation_data, validation_labels = fake_data(EVAL_BATCH_SIZE) test_data, test_labels = fake_data(EVAL_BATCH_SIZE) num_epochs = 1 else: stft_training, mfcc_training, labels_training = joblib.load(open(FLAGS.input, mode='rb')) stft_training = numpy.array(stft_training) mfcc_training = numpy.array(mfcc_training) labels_training = numpy.array(labels_training) stft_shape = stft_training.shape stft_shape = (None, stft_shape[1], stft_shape[2]) mfcc_shape = mfcc_training.shape mfcc_shape = (None, mfcc_shape[1], mfcc_shape[2]) labels_shape = labels_training.shape labels_shape = (None) stft_placeholder = tf.placeholder(stft_training.dtype, stft_shape) labels_placeholder = tf.placeholder(labels_training.dtype, labels_shape) mfcc_placeholder = tf.placeholder(mfcc_training.dtype, mfcc_shape) dataset_training = tf.data.Dataset.from_tensor_slices((stft_placeholder, mfcc_placeholder, labels_placeholder)) dataset_training = dataset_training .apply( tf.data.experimental.shuffle_and_repeat(len(stft_training), None)) dataset_training = dataset_training .batch(BATCH_SIZE) dataset_training = dataset_training .prefetch(1) iterator_training = dataset_training.make_initializable_iterator() next_element_training = iterator_training.get_next() num_epochs = NUM_EPOCHS train_size = labels_training.shape[0] stft = tf.placeholder( data_type(), shape=(BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WEITH, NUM_CHANNELS)) mfcc = tf.placeholder( data_type(), shape=(BATCH_SIZE, IMAGE_HEIGHT, IMAGE_WEITH, NUM_CHANNELS)) labels = tf.placeholder(tf.int64, shape=(BATCH_SIZE,)) model = BRN(stft, mfcc) config = tf.ConfigProto() config.gpu_options.allow_growth = True with tf.Session(config=config) as sess: tf.global_variables_initializer().run() train_writer = tf.summary.FileWriter(log_dir + 'train', sess.graph) converter = tf.lite.TFLiteConverter.from_session(sess, [stft,mfcc], [logits]) tflite_model = converter.convert() open("BRN.tflite", "wb").write(tflite_model) print('Initialized!') sess.run(iterator_training.initializer, feed_dict={stft_placeholder:stft_training, mfcc_placeholder:stft_training, labels_placeholder:stft_training}) ``` 报错信息: ![图片说明](https://img-ask.csdn.net/upload/201907/21/1563699144_423650.png)

学Python后到底能干什么?网友:我太难了

感觉全世界营销文都在推Python,但是找不到工作的话,又有哪个机构会站出来给我推荐工作? 笔者冷静分析多方数据,想跟大家说:关于超越老牌霸主Java,过去几年间Python一直都被寄予厚望。但是事实是虽然上升趋势,但是国内环境下,一时间是无法马上就超越Java的,也可以换句话说:超越Java只是时间问题罢。 太嚣张了会Python的人!找工作拿高薪这么简单? https://edu....

大学四年自学走来,这些私藏的实用工具/学习网站我贡献出来了

大学四年,看课本是不可能一直看课本的了,对于学习,特别是自学,善于搜索网上的一些资源来辅助,还是非常有必要的,下面我就把这几年私藏的各种资源,网站贡献出来给你们。主要有:电子书搜索、实用工具、在线视频学习网站、非视频学习网站、软件下载、面试/求职必备网站。 注意:文中提到的所有资源,文末我都给你整理好了,你们只管拿去,如果觉得不错,转发、分享就是最大的支持了。 一、电子书搜索 对于大部分程序员...

在中国程序员是青春饭吗?

今年,我也32了 ,为了不给大家误导,咨询了猎头、圈内好友,以及年过35岁的几位老程序员……舍了老脸去揭人家伤疤……希望能给大家以帮助,记得帮我点赞哦。 目录: 你以为的人生 一次又一次的伤害 猎头界的真相 如何应对互联网行业的「中年危机」 一、你以为的人生 刚入行时,拿着傲人的工资,想着好好干,以为我们的人生是这样的: 等真到了那一天,你会发现,你的人生很可能是这样的: ...

Java校招入职华为,半年后我跑路了

何来 我,一个双非本科弟弟,有幸在 19 届的秋招中得到前东家华为(以下简称 hw)的赏识,当时秋招签订就业协议,说是入了某 java bg,之后一系列组织架构调整原因等等让人无法理解的神操作,最终毕业前夕,被通知调往其他 bg 做嵌入式开发(纯 C 语言)。 由于已至于校招末尾,之前拿到的其他 offer 又无法再收回,一时感到无力回天,只得默默接受。 毕业后,直接入职开始了嵌入式苦旅,由于从未...

Java基础知识面试题(2020最新版)

文章目录Java概述何为编程什么是Javajdk1.5之后的三大版本JVM、JRE和JDK的关系什么是跨平台性?原理是什么Java语言有哪些特点什么是字节码?采用字节码的最大好处是什么什么是Java程序的主类?应用程序和小程序的主类有何不同?Java应用程序与小程序之间有那些差别?Java和C++的区别Oracle JDK 和 OpenJDK 的对比基础语法数据类型Java有哪些数据类型switc...

@程序员:GitHub这个项目快薅羊毛

今天下午在朋友圈看到很多人都在发github的羊毛,一时没明白是怎么回事。 后来上百度搜索了一下,原来真有这回事,毕竟资源主义的羊毛不少啊,1000刀刷爆了朋友圈!不知道你们的朋友圈有没有看到类似的消息。 这到底是啥情况? 微软开发者平台GitHub 的一个区块链项目 Handshake ,搞了一个招募新会员的活动,面向GitHub 上前 25万名开发者派送 4,246.99 HNS币,大约价...

用python打开电脑摄像头,并把图像传回qq邮箱【Pyinstaller打包】

前言: 如何悄悄的打开朋友的摄像头,看看她最近过的怎么样,嘿嘿!这次让我带你们来实现这个功能。 注: 这个程序仅限在朋友之间开玩笑,别去搞什么违法的事情哦。 代码 发送邮件 使用python内置的email模块即可完成。导入相应的代码封装为一个send函数,顺便导入需要导入的包 注: 下面的代码有三处要修改的地方,两处写的qq邮箱地址,还有一处写的qq邮箱授权码,不知道qq邮箱授权码的可以去百度一...

做了5年运维,靠着这份监控知识体系,我从3K变成了40K

从来没讲过运维,因为我觉得运维这种东西不需要太多的知识面,然后我一个做了运维朋友告诉我大错特错,他就是从3K的运维一步步到40K的,甚至笑着说:我现在感觉自己什么都能做。 既然讲,就讲最重要的吧。 监控是整个运维乃至整个产品生命周期中最重要的一环,事前及时预警发现故障,事后提供详实的数据用于追查定位问题。目前业界有很多不错的开源产品可供选择。选择一款开源的监控系统,是一个省时省力、效率最高的方...

计算机网络——浅析网络层

一、前言 注意,关于ipv4和ipv6,ipv4是ip协议第4版本,也表示这个版本的ip一共4个字节,同样地,ipv6是ip协议第6版本,也表示这个版本的ip一共6个字节。 关于网络层使用路由器实现互联:在计算机网络的分层结构中,不同层有不同的中继设备: 计算机网络层 中继设备/中继系统 物理层 中继器、集线器Hub 数据链路层 网桥或交换机(交换机是多端口网桥,两者本质上是一个东西) 网络层 路...

我以为我学懂了数据结构,直到看了这个导图才发现,我错了

数据结构与算法思维导图

技术大佬:我去,你写的 switch 语句也太老土了吧

昨天早上通过远程的方式 review 了两名新来同事的代码,大部分代码都写得很漂亮,严谨的同时注释也很到位,这令我非常满意。但当我看到他们当中有一个人写的 switch 语句时,还是忍不住破口大骂:“我擦,小王,你丫写的 switch 语句也太老土了吧!” 来看看小王写的代码吧,看完不要骂我装逼啊。 private static String createPlayer(PlayerTypes p...

华为初面+综合面试(Java技术面)附上面试题

华为面试整体流程大致分为笔试,性格测试,面试,综合面试,回学校等结果。笔试来说,华为的难度较中等,选择题难度和网易腾讯差不多。最后的代码题,相比下来就简单很多,一共3道题目,前2题很容易就AC,题目已经记不太清楚,不过难度确实不大。最后一题最后提交的代码过了75%的样例,一直没有发现剩下的25%可能存在什么坑。 笔试部分太久远,我就不怎么回忆了。直接将面试。 面试 如果说腾讯的面试是挥金如土...

和黑客斗争的 6 天!

互联网公司工作,很难避免不和黑客们打交道,我呆过的两家互联网公司,几乎每月每天每分钟都有黑客在公司网站上扫描。有的是寻找 Sql 注入的缺口,有的是寻找线上服务器可能存在的漏洞,大部分都...

讲一个程序员如何副业月赚三万的真实故事

loonggg读完需要3分钟速读仅需 1 分钟大家好,我是你们的校长。我之前讲过,这年头,只要肯动脑,肯行动,程序员凭借自己的技术,赚钱的方式还是有很多种的。仅仅靠在公司出卖自己的劳动时...

win10暴力查看wifi密码

刚才邻居打了个电话说:喂小灰,你家wifi的密码是多少,我怎么连不上了。 我。。。 我也忘了哎,就找到了一个好办法,分享给大家: 第一种情况:已经连接上的wifi,怎么知道密码? 打开:控制面板\网络和 Internet\网络连接 然后右击wifi连接的无线网卡,选择状态 然后像下图一样: 第二种情况:前提是我不知道啊,但是我以前知道密码。 此时可以利用dos命令了 1、利用netsh wlan...

上班一个月,后悔当初着急入职的选择了

最近有个老铁,告诉我说,上班一个月,后悔当初着急入职现在公司了。他之前在美图做手机研发,今年美图那边今年也有一波组织优化调整,他是其中一个,在协商离职后,当时捉急找工作上班,因为有房贷供着,不能没有收入来源。所以匆忙选了一家公司,实际上是一个大型外包公司,主要派遣给其他手机厂商做外包项目。**当时承诺待遇还不错,所以就立马入职去上班了。但是后面入职后,发现薪酬待遇这块并不是HR所说那样,那个HR自...

女程序员,为什么比男程序员少???

昨天看到一档综艺节目,讨论了两个话题:(1)中国学生的数学成绩,平均下来看,会比国外好?为什么?(2)男生的数学成绩,平均下来看,会比女生好?为什么?同时,我又联想到了一个技术圈经常讨...

副业收入是我做程序媛的3倍,工作外的B面人生是怎样的?

提到“程序员”,多数人脑海里首先想到的大约是:为人木讷、薪水超高、工作枯燥…… 然而,当离开工作岗位,撕去层层标签,脱下“程序员”这身外套,有的人生动又有趣,马上展现出了完全不同的A/B面人生! 不论是简单的爱好,还是正经的副业,他们都干得同样出色。偶尔,还能和程序员的特质结合,产生奇妙的“化学反应”。 @Charlotte:平日素颜示人,周末美妆博主 大家都以为程序媛也个个不修边幅,但我们也许...

MySQL数据库面试题(2020最新版)

文章目录数据库基础知识为什么要使用数据库什么是SQL?什么是MySQL?数据库三大范式是什么mysql有关权限的表都有哪几个MySQL的binlog有有几种录入格式?分别有什么区别?数据类型mysql有哪些数据类型引擎MySQL存储引擎MyISAM与InnoDB区别MyISAM索引与InnoDB索引的区别?InnoDB引擎的4大特性存储引擎选择索引什么是索引?索引有哪些优缺点?索引使用场景(重点)...

记一次腾讯面试,我挂在了最熟悉不过的队列上……

腾讯后台面试,面试官问:如何自己实现队列?

如果你是老板,你会不会踢了这样的员工?

有个好朋友ZS,是技术总监,昨天问我:“有一个老下属,跟了我很多年,做事勤勤恳恳,主动性也很好。但随着公司的发展,他的进步速度,跟不上团队的步伐了,有点...

离职半年了,老东家又发 offer,回不回?

有小伙伴问松哥这个问题,他在上海某公司,在离职了几个月后,前公司的领导联系到他,希望他能够返聘回去,他很纠结要不要回去? 俗话说好马不吃回头草,但是这个小伙伴既然感到纠结了,我觉得至少说明了两个问题:1.曾经的公司还不错;2.现在的日子也不是很如意。否则应该就不会纠结了。 老实说,松哥之前也有过类似的经历,今天就来和小伙伴们聊聊回头草到底吃不吃。 首先一个基本观点,就是离职了也没必要和老东家弄的苦...

2020阿里全球数学大赛:3万名高手、4道题、2天2夜未交卷

阿里巴巴全球数学竞赛( Alibaba Global Mathematics Competition)由马云发起,由中国科学技术协会、阿里巴巴基金会、阿里巴巴达摩院共同举办。大赛不设报名门槛,全世界爱好数学的人都可参与,不论是否出身数学专业、是否投身数学研究。 2020年阿里巴巴达摩院邀请北京大学、剑桥大学、浙江大学等高校的顶尖数学教师组建了出题组。中科院院士、美国艺术与科学院院士、北京国际数学...

HTTP与HTTPS的区别

面试官问HTTP与HTTPS的区别,我这样回答让他竖起大拇指!

男生更看重女生的身材脸蛋,还是思想?

往往,我们看不进去大段大段的逻辑。深刻的哲理,往往短而精悍,一阵见血。问:产品经理挺漂亮的,有点心动,但不知道合不合得来。男生更看重女生的身材脸蛋,还是...

程序员为什么千万不要瞎努力?

本文作者用对比非常鲜明的两个开发团队的故事,讲解了敏捷开发之道 —— 如果你的团队缺乏统一标准的环境,那么即使勤劳努力,不仅会极其耗时而且成果甚微,使用...

为什么程序员做外包会被瞧不起?

二哥,有个事想询问下您的意见,您觉得应届生值得去外包吗?公司虽然挺大的,中xx,但待遇感觉挺低,马上要报到,挺纠结的。

当HR压你价,说你只值7K,你该怎么回答?

当HR压你价,说你只值7K时,你可以流畅地回答,记住,是流畅,不能犹豫。 礼貌地说:“7K是吗?了解了。嗯~其实我对贵司的面试官印象很好。只不过,现在我的手头上已经有一份11K的offer。来面试,主要也是自己对贵司挺有兴趣的,所以过来看看……”(未完) 这段话主要是陪HR互诈的同时,从公司兴趣,公司职员印象上,都给予对方正面的肯定,既能提升HR的好感度,又能让谈判气氛融洽,为后面的发挥留足空间。...

面试:第十六章:Java中级开发(16k)

HashMap底层实现原理,红黑树,B+树,B树的结构原理 Spring的AOP和IOC是什么?它们常见的使用场景有哪些?Spring事务,事务的属性,传播行为,数据库隔离级别 Spring和SpringMVC,MyBatis以及SpringBoot的注解分别有哪些?SpringMVC的工作原理,SpringBoot框架的优点,MyBatis框架的优点 SpringCould组件有哪些,他们...

面试阿里p7,被按在地上摩擦,鬼知道我经历了什么?

面试阿里p7被问到的问题(当时我只知道第一个):@Conditional是做什么的?@Conditional多个条件是什么逻辑关系?条件判断在什么时候执...

立即提问
相关内容推荐