How the target passed to Keras train_on_batch relates to the model's output

While testing policy gradient on the CartPole balancing task with Keras + gym, the model is built as follows:

        inputs = Input(shape=(4,),name='ob_inputs')
        x = Dense(16,activation='relu')(inputs)
        x = Dense(16,activation='relu')(x)
        x = Dense(1,activation='sigmoid')(x)
        model = Model(inputs=inputs,outputs = x)

Here the output layer is a single neuron that outputs a number in [0, 1], the probability of the cart's action.
But in the training loop, the model is trained with:

                X = np.array(states)
                y = np.array(list(zip(actions,discount_rewards)))
                loss = self.model.train_on_batch(X,y)

Here the target data `y` is a 2-D array: the first column is the action that was taken and the second column is the discounted reward. Since the network's output and the target data then have different shapes, how is the loss computed during training? Will Keras just fit the first column of `y`?
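For reference: with a built-in loss, Keras would not silently pick one column of `y` — depending on the version it either raises a shape error or broadcasts. Code shaped like this normally relies on a custom loss that unpacks the two columns itself. A minimal NumPy sketch of what such a policy-gradient loss computes (the name `pg_loss` is illustrative, not from the original code):

```python
import numpy as np

def pg_loss(y_true, y_pred, eps=1e-8):
    """Policy-gradient loss for a 1-unit sigmoid policy.

    y_true[:, 0] = action taken (0 or 1)
    y_true[:, 1] = discounted return for that step
    y_pred[:, 0] = model's P(action == 1)
    """
    action = y_true[:, 0]
    reward = y_true[:, 1]
    prob = y_pred[:, 0]
    # log-probability of the action that was actually taken
    log_prob = action * np.log(prob + eps) + (1 - action) * np.log(1 - prob + eps)
    # reward-weighted negative log-likelihood, averaged over the batch
    return -np.mean(log_prob * reward)
```

In Keras the same arithmetic would be written with `K.log`/`K.mean` and passed via `model.compile(loss=...)`, so the 2-column `y` is consumed by the loss function rather than compared elementwise to the 1-column output.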

1 Answer

Related questions
About the loss returned by Keras model.train_on_batch()

Suppose I feed one batch of 100 binary-classification samples into the model, using cross-entropy as the loss function and Adam as the optimizer. When I call train_on_batch(), does it compute the loss per sample and update the model after each one (100 updates), returning the loss after the last update? Or does it aggregate the loss over the 100 samples and perform a single update? In that case, what loss value does it return?
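For comparison: `train_on_batch` performs one gradient update on the whole batch and returns the batch-mean loss, not one update per sample. The reduction can be sketched in NumPy (a sketch of the standard binary cross-entropy mean, not the exact Keras source):

```python
import numpy as np

def binary_crossentropy_mean(y_true, y_pred, eps=1e-7):
    # Keras-style reduction: compute a per-sample loss,
    # then average it over the batch to get one scalar.
    p = np.clip(y_pred, eps, 1 - eps)
    per_sample = -(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))
    return per_sample.mean()
```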

What to do when keras model.fit_generator cannot load the training set after finishing one epoch?

1. While training a neural network I ran into the problem that training cannot continue after the first epoch finishes; screenshot: ![screenshot](https://img-ask.csdn.net/upload/202002/08/1581151633_972155.png) The data-generation code is:
```
def GET_DATASET_SHUFFLE(train_x, train_y, batch_size):
    # random.shuffle(X_samples)
    batch_num = int(len(train_x) / batch_size)
    max_len = batch_num * batch_size
    X_samples = np.array(train_x[0:max_len])
    Y_samples = np.array(train_y[0:max_len])
    X_batches = np.split(X_samples, batch_num)
    Y_batches = np.split(Y_samples, batch_num)
    for i in range(batch_num):
        x = np.array(list(map(load_image, X_batches[i])))
        y = np.array(list(map(load_label, Y_batches[i])))
        yield x, y
```
I would like to ask for advice — I have only just started with this and don't understand it well.
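One common cause, assuming `fit_generator` is called with more than one epoch: a plain Python generator like the one above is exhausted after a single pass, so the second epoch has nothing to read. The usual fix is to loop forever inside the generator; a hedged sketch (names are illustrative, and the real version would keep the `load_image`/`load_label` mapping):

```python
def batches_forever(train_x, train_y, batch_size):
    # A generator passed to fit_generator must never run out:
    # restart (and optionally reshuffle) after every full pass.
    while True:
        batch_num = len(train_x) // batch_size
        for i in range(batch_num):
            s = i * batch_size
            yield train_x[s:s + batch_size], train_y[s:s + batch_size]
```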

Insufficient precision in per-batch acc collected with a Keras callback

I want to use a callback to collect the acc of every batch during training, but the acc collected per batch only keeps two decimal places, while the acc collected per epoch keeps many decimal places; the loss collected both per batch and per epoch keeps many decimal places. The code is as follows:
```
class LossHistory(callbacks.Callback):
    def on_train_begin(self, logs={}):
        self.losses = {'batch': [], 'epoch': []}
        self.accuracy = {'batch': [], 'epoch': []}
        self.val_loss = {'batch': [], 'epoch': []}
        self.val_acc = {'batch': [], 'epoch': []}

    def on_batch_end(self, batch, logs={}):
        self.losses['batch'].append(logs.get('loss'))
        self.accuracy['batch'].append(logs.get('acc'))
        self.val_loss['batch'].append(logs.get('val_loss'))
        self.val_acc['batch'].append(logs.get('val_acc'))

    def on_epoch_end(self, batch, logs={}):
        self.losses['epoch'].append(logs.get('loss'))
        self.accuracy['epoch'].append(logs.get('acc'))
        self.val_loss['epoch'].append(logs.get('val_loss'))
        self.val_acc['epoch'].append(logs.get('val_acc'))

    def loss_plot(self, loss_type):
        iters = range(len(self.losses[loss_type]))
        plt.figure()
        # acc
        plt.plot(iters, self.accuracy[loss_type], 'r', label='train acc')
        # loss
        plt.plot(iters, self.losses[loss_type], 'g', label='train loss')
        if loss_type == 'epoch':
            # val_acc
            plt.plot(iters, self.val_acc[loss_type], 'b', label='val acc')
            # val_loss
            plt.plot(iters, self.val_loss[loss_type], 'k', label='val loss')
        plt.grid(True)
        plt.xlabel(loss_type)
        plt.ylabel('acc-loss')
        plt.legend(loc="upper right")
        plt.show()


class Csr:
    def __init__(self, voc):
        self.model = Sequential()
        # B*L
        self.model.add(Embedding(voc.num_words, 300, mask_zero=True,
                                 weights=[voc.index2emb], trainable=False))
        # B*L*256
        self.model.add(GRU(256))
        # B*256
        self.model.add(Dropout(0.5))
        self.model.add(Dense(1, activation='sigmoid'))
        # B*1
        self.model.compile(loss='binary_crossentropy', optimizer='rmsprop',
                           metrics=['accuracy'])
        print('compile complete')

    def train(self, x_train, y_train, b_s=50, epo=10):
        print('training.....')
        history = LossHistory()
        his = self.model.fit(x_train, y_train, batch_size=b_s, epochs=epo,
                             callbacks=[history])
        history.loss_plot('batch')
        print('training complete')
        return his, history
```
The program output looks like this: ![screenshot](https://img-ask.csdn.net/upload/201905/14/1557803291_621582.png) ![screenshot](https://img-ask.csdn.net/upload/201905/14/1557803304_240896.png)
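One likely explanation, assuming the default `b_s=50` in the `train()` call above: per-batch accuracy is computed over just that batch, so it can only take the values k/50, all of which terminate after two decimal places; the per-epoch value averages over many batches and therefore shows many digits. A quick check of that granularity:

```python
batch_size = 50  # the default b_s in the train() call above

# Accuracy over a single batch is (number correct) / batch_size,
# so it can only take batch_size + 1 distinct values: 0.00, 0.02, ... 1.00.
possible_acc = [k / batch_size for k in range(batch_size + 1)]
```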

Why does the same problem give different results in TensorFlow and Keras?

**CIFAR-10 classification, with the same model structure, loss function, learning rate and other hyperparameters, implemented once in TensorFlow and once in Keras. After 20 epochs, prediction accuracy on the test set always differs by several percentage points and I can't figure out why. Code below. This is the TF code:**
```
import tensorflow as tf
import numpy as np
import pickle as pk

tf.reset_default_graph()

batch_size = 64
test_size = 10000
img_size = 32
num_classes = 10
training_epochs = 10
test_size = 200

###############################################################################
def unpickle(filename):
    '''Unpack the data'''
    with open(filename, 'rb') as f:
        d = pk.load(f, encoding='latin1')
        return d

def onehot(labels):
    '''one-hot encoding'''
    n_sample = len(labels)
    n_class = max(labels) + 1
    onehot_labels = np.zeros((n_sample, n_class))
    onehot_labels[np.arange(n_sample), labels] = 1
    return onehot_labels

# training set
data1 = unpickle('data_batch_1')
data2 = unpickle('data_batch_2')
data3 = unpickle('data_batch_3')
data4 = unpickle('data_batch_4')
data5 = unpickle('data_batch_5')
X_train = np.concatenate((data1['data'], data2['data'], data3['data'],
                          data4['data'], data5['data']), axis=0) / 255.0
y_train = np.concatenate((data1['labels'], data2['labels'], data3['labels'],
                          data4['labels'], data5['labels']), axis=0)
y_train = onehot(y_train)

# test set
test = unpickle('test_batch')
X_test = test['data'] / 255.0
y_test = onehot(test['labels'])
del test, data1, data2, data3, data4, data5

###############################################################################
w = tf.Variable(tf.random_normal([5, 5, 3, 32], stddev=0.01))
w_c = tf.Variable(tf.random_normal([32 * 16 * 16, 512], stddev=0.1))
w_o = tf.Variable(tf.random_normal([512, num_classes], stddev=0.1))

def init_bias(shape):
    return tf.Variable(tf.constant(0.0, shape=shape))

b = init_bias([32])
b_c = init_bias([512])
b_o = init_bias([10])

def model(X, w, w_c, w_o, p_keep_conv, p_keep_hidden, b, b_c, b_o):
    conv1 = tf.nn.conv2d(X, w, strides=[1, 1, 1, 1], padding='SAME')  # 32x32x32
    conv1 = tf.nn.bias_add(conv1, b)
    conv1 = tf.nn.relu(conv1)
    conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1],
                           padding='SAME')  # 16x16x32
    conv1 = tf.nn.dropout(conv1, p_keep_conv)
    FC_layer = tf.reshape(conv1, [-1, 32 * 16 * 16])
    out_layer = tf.matmul(FC_layer, w_c) + b_c
    out_layer = tf.nn.relu(out_layer)
    out_layer = tf.nn.dropout(out_layer, p_keep_hidden)
    result = tf.matmul(out_layer, w_o) + b_o
    return result

trX, trY, teX, teY = X_train, y_train, X_test, y_test
trX = trX.reshape(-1, img_size, img_size, 3)
teX = teX.reshape(-1, img_size, img_size, 3)

X = tf.placeholder("float", [None, img_size, img_size, 3])
Y = tf.placeholder("float", [None, num_classes])
p_keep_conv = tf.placeholder("float")
p_keep_hidden = tf.placeholder("float")

py_x = model(X, w, w_c, w_o, p_keep_conv, p_keep_hidden, b, b_c, b_o)
Y_ = tf.nn.softmax_cross_entropy_with_logits_v2(logits=py_x, labels=Y)
cost = tf.reduce_mean(Y_)
optimizer = tf.train.RMSPropOptimizer(0.001, 0.9).minimize(cost)
predict_op = tf.argmax(py_x, 1)

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    for i in range(training_epochs):
        training_batch = zip(range(0, len(trX), batch_size),
                             range(batch_size, len(trX) + 1, batch_size))
        perm = np.arange(len(trX))
        np.random.shuffle(perm)
        trX = trX[perm]
        trY = trY[perm]
        for start, end in training_batch:
            sess.run(optimizer, feed_dict={X: trX[start:end], Y: trY[start:end],
                                           p_keep_conv: 0.75, p_keep_hidden: 0.5})
        test_batch = zip(range(0, len(teX), test_size),
                         range(test_size, len(teX) + 1, test_size))
        accuracyResult = 0
        for start, end in test_batch:
            accuracyResult = accuracyResult + sum(
                np.argmax(teY[start:end], axis=1) ==
                sess.run(predict_op, feed_dict={X: teX[start:end], Y: teY[start:end],
                                                p_keep_conv: 1, p_keep_hidden: 1}))
        print(i, accuracyResult / 10000)
```
**This is the Keras code:**
```
from keras import initializers
from keras.datasets import cifar10
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.optimizers import SGD, Adam, RMSprop
# import matplotlib.pyplot as plt

# CIFAR_10 is a set of 60K images 32x32 pixels on 3 channels
IMG_CHANNELS = 3
IMG_ROWS = 32
IMG_COLS = 32

# constant
BATCH_SIZE = 64
NB_EPOCH = 10
NB_CLASSES = 10
VERBOSE = 1
VALIDATION_SPLIT = 0
OPTIM = RMSprop()

# load dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
# print('X_train shape:', X_train.shape)
# print(X_train.shape[0], 'train samples')
# print(X_test.shape[0], 'test samples')

# convert to categorical
Y_train = np_utils.to_categorical(y_train, NB_CLASSES)
Y_test = np_utils.to_categorical(y_test, NB_CLASSES)

# float and normalization
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

# network
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same',
                 input_shape=(IMG_ROWS, IMG_COLS, IMG_CHANNELS),
                 kernel_initializer=initializers.random_normal(stddev=0.01),
                 bias_initializer=initializers.Zeros()))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))  # only effective for 0 < rate < 1
model.add(Flatten())
model.add(Dense(512,
                kernel_initializer=initializers.random_normal(stddev=0.1),
                bias_initializer=initializers.Zeros()))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(NB_CLASSES,
                kernel_initializer=initializers.random_normal(stddev=0.1),
                bias_initializer=initializers.Zeros()))
model.add(Activation('softmax'))
model.summary()

# train
model.compile(loss='categorical_crossentropy', optimizer=OPTIM, metrics=['accuracy'])
model.fit(X_train, Y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH,
          validation_split=VALIDATION_SPLIT, verbose=VERBOSE)
score = model.evaluate(X_test, Y_test, batch_size=200, verbose=VERBOSE)
print("Test score:", score[0])
print('Test accuracy:', score[1])
```

A Schrödinger-like training-result problem in Keras

I have just started learning Keras. Today, while testing fitting of a nonlinear function, I found that even with the 'relu' activation I still can't fit the data well, which has bothered me for a long time. Even stranger, a statement that looks completely unrelated to the result directly changes the distribution of the results. It's this line:
```
print(y_pred)
```
Result without it: ![screenshot](https://img-ask.csdn.net/upload/202004/24/1587719740_46631.jpg) Result with it: ![screenshot](https://img-ask.csdn.net/upload/202004/24/1587719761_631438.jpg) or ![screenshot](https://img-ask.csdn.net/upload/202004/24/1587719776_946600.jpg) The code is as follows:
```
import keras
import numpy as np
import matplotlib.pyplot as plt
# Sequential model
from keras.models import Sequential
# fully connected layer
from keras.layers import Dense, Activation
from keras.optimizers import SGD

# generate random data with numpy
x_data = np.linspace(-0.5, 0.5, 200)
noise = np.random.normal(0, 0.02, x_data.shape)
y_data = np.square(x_data) + noise

# show the random points
plt.scatter(x_data, y_data)
plt.show()

# build a Sequential model
model = Sequential()
# add a fully connected layer to the model
model.add(Dense(units=10, input_dim=1, activation='relu'))
# model.add(Activation("relu")) doesn't work?
# model.add(Activation("relu"))
model.add(Dense(units=1, activation='relu'))
# model.add(Activation("relu")) doesn't work
# model.add(Activation("relu"))

# define the optimizer
sgd = SGD(lr=0.3)
model.compile(optimizer=sgd, loss="mse")

for step in range(3000):
    cost = model.train_on_batch(x_data, y_data)
    if step % 500 == 0:
        print("cost: ", cost)

W, b = model.layers[0].get_weights()
print("W: ", W, "b: ", b)

# feed x_data through the network to get predictions
y_pred = model.predict(x_data)
# whether or not this line is present directly affects the result
print(y_pred)

plt.scatter(x_data, y_pred)
plt.plot(x_data, y_pred, "r-", lw=3)
plt.show()
```

The same model run on a Mac and on a Windows server differs in accuracy by more than 60%!

This is a CNN model for text classification. On my own Mac it reaches about 90% accuracy, but on a Windows server the accuracy is only 25%. What could be the reason?
```
from __future__ import print_function
import numpy as np
from keras.utils import np_utils
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
import pandas as pd
import os
from keras import backend as K

print('Loading Dict')
embeddings_index = {}
f = open(os.path.join('glove.6B.100d.txt'))
for line in f:
    values = line.split()
    word = values[0]
    coefs = np.asarray(values[1:], dtype='float32')
    embeddings_index[word] = coefs
f.close()

print('Loading dataset')
tmp = pd.read_csv('train.csv')
train_X = np.array(tmp.iloc[:, 2]).astype('str')
train_y = np.array(tmp.iloc[:, 0]).astype('int16')
train_y_ohe = np_utils.to_categorical(train_y)
del tmp

tmp = pd.read_csv('test.csv')
test_X = np.array(tmp.iloc[:, 2]).astype('str')
test_y = np.array(tmp.iloc[:, 0]).astype('int16')
test_y_ohe = np_utils.to_categorical(test_y)
del tmp

train_y_ohe = train_y_ohe.astype('float32')
test_y_ohe = test_y_ohe.astype('float32')
X = np.append(train_X, test_X)

print('Tokenizing')
t = Tokenizer()
t.fit_on_texts(X)
vocab_size = len(t.word_index) + 1
# integer encode the documents
encoded_X = t.texts_to_sequences(X)
# pad documents to a max length of x words
max_length = 50
padded_X = pad_sequences(encoded_X, maxlen=max_length, padding='post')

embedding_matrix = np.zeros((vocab_size, 100)).astype('float32')
for word, i in t.word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        embedding_matrix[i] = embedding_vector

padded_X_train = pad_sequences(encoded_X[0:119999], maxlen=max_length, padding='post')
padded_X_test = pad_sequences(encoded_X[119999:127598], maxlen=max_length, padding='post')
padded_X_test = padded_X_test.astype('float32')
padded_X_train = padded_X_train.astype('float32')

print('Establish model')
from keras.models import Model
from keras.layers import Dense, Embedding, Convolution1D, concatenate, Flatten, Input, MaxPooling1D, Dropout, Merge
from keras.callbacks import TensorBoard

K.clear_session()
x = Input(shape=(50,), dtype='float32')
embed = Embedding(input_dim=vocab_size, output_dim=100, weights=[embedding_matrix],
                  input_length=max_length)(x)
cnn1 = Convolution1D(128, 9, activation='relu', padding='same', strides=1)(embed)
cnn1 = MaxPooling1D(5)(cnn1)
cnn2 = Convolution1D(128, 6, activation='relu', padding='same', strides=1)(embed)
cnn2 = MaxPooling1D(5)(cnn2)
cnn3 = Convolution1D(128, 3, activation='relu', padding='same', strides=1)(embed)
cnn3 = MaxPooling1D(5)(cnn3)
cnn = concatenate([cnn1, cnn2, cnn3])
flat = Flatten()(cnn)
drop = Dropout(0.1)(flat)
y = Dense(5, activation='softmax')(drop)
model = Model(inputs=x, outputs=y)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
tensorboard = TensorBoard(log_dir='./logs', write_graph=True, write_grads=True,
                          histogram_freq=True)
model.fit(padded_X_train, train_y_ohe, epochs=5, batch_size=10000, verbose=1,
          callbacks=[tensorboard], validation_data=[padded_X_test, test_y_ohe])
'''pred0 = model.predict_classes(padded_X, verbose=0)
acc_train = np.sum(train_y == pred0, axis=0) / train_X.shape[0]'''
```

Implementing self-attention and Precision/Recall/F1-score under Keras?

Could someone please take a look: (1) Why can't I get the Precision, Recall and F1-score values back? (2) Why does adding a self-attention layer before the CNN make the post-training acc drop to around 0.78? [A first-year grad student asking for details — many thanks]
```
import os  # used to check whether files exist
import numpy as np
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.callbacks import Callback
from sklearn.metrics import f1_score, precision_score, recall_score

maxlen = 380               # truncate sentences to this length
training_samples = 20000   # number of training samples
validation_samples = 5000  # number of validation samples
max_words = 10000          # only consider the most common words in the dataset

def dataProcess():
    imdb_dir = 'data/aclImdb'  # base path, used throughout
    # training set
    train_dir = os.path.join(imdb_dir, 'train')
    train_labels = []
    train_texts = []
    for label_type in ['neg', 'pos']:
        dir_name = os.path.join(train_dir, label_type)
        for fname in os.listdir(dir_name):  # all file names in the directory
            if fname[-4:] == '.txt':
                f = open(os.path.join(dir_name, fname), 'r', encoding='utf8')
                train_texts.append(f.read())
                f.close()
                if label_type == 'neg':
                    train_labels.append(0)
                else:
                    train_labels.append(1)
    # test set
    test_dir = os.path.join(imdb_dir, 'test')
    test_labels = []
    test_texts = []
    for label_type in ['neg', 'pos']:
        dir_name = os.path.join(test_dir, label_type)
        for fname in sorted(os.listdir(dir_name)):
            if fname[-4:] == '.txt':
                f = open(os.path.join(dir_name, fname), 'r', encoding='utf8')
                test_texts.append(f.read())
                f.close()
                if label_type == 'neg':
                    test_labels.append(0)
                else:
                    test_labels.append(1)
    # tokenize and split into training and validation sets
    tokenizer = Tokenizer(num_words=max_words)
    tokenizer.fit_on_texts(train_texts)                    # build the word index
    sequences = tokenizer.texts_to_sequences(train_texts)  # vectorize as integer indices
    word_index = tokenizer.word_index                      # index dictionary
    print('Found %s unique tokens.' % len(word_index))
    data = pad_sequences(sequences, maxlen=maxlen)
    train_labels = np.asarray(train_labels)  # list -> array
    print('Shape of data tensor:', data.shape)
    print('Shape of label tensor:', train_labels.shape)
    indices = np.arange(data.shape[0])  # review order 0,1,2,3
    np.random.shuffle(indices)          # shuffled order, e.g. 3,1,2,0
    data = data[indices]
    train_labels = train_labels[indices]
    x_train = data[:training_samples]
    y_train = train_labels[:training_samples]
    x_val = data[training_samples: training_samples + validation_samples]
    y_val = train_labels[training_samples: training_samples + validation_samples]
    # the test set needs vectorizing as well
    test_sequences = tokenizer.texts_to_sequences(test_texts)
    x_test = pad_sequences(test_sequences, maxlen=maxlen)
    y_test = np.asarray(test_labels)
    return x_train, y_train, x_val, y_val, x_test, y_test, word_index

embedding_dim = 100  # number of features

# build an embedding matrix from the pretrained GloVe file that can be
# loaded into the Embedding layer
def load_glove(word_index):
    embedding_file = 'data/glove.6B'
    embeddings_index = {}
    f = open(os.path.join(embedding_file, 'glove.6B.100d.txt'), 'r', encoding='utf8')
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[word] = coefs
    f.close()
    # matrix of shape (max_words, embedding_dim)
    embedding_matrix = np.zeros((max_words, embedding_dim))
    for word, i in word_index.items():  # words and indices in the dictionary
        if i >= max_words:
            continue
        embedding_vector = embeddings_index.get(word)
        if embedding_vector is not None:
            embedding_matrix[i] = embedding_vector
    return embedding_matrix

if __name__ == '__main__':
    x_train, y_train, x_val, y_val, x_test, y_test, word_index = dataProcess()
    embedding_matrix = load_glove(word_index)
    # the embedding matrix could be saved here to make later fine-tuning easier

    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout, Activation, Flatten
    from keras.layers.recurrent import LSTM
    from keras.layers import Embedding
    from keras.layers import Bidirectional
    from keras.layers import Conv1D, MaxPooling1D
    import keras
    from keras_self_attention import SeqSelfAttention

    model = Sequential()
    model.add(Embedding(max_words, embedding_dim, input_length=maxlen))
    model.add(SeqSelfAttention(attention_activation='sigmod'))
    model.add(Conv1D(filters=64, kernel_size=5, padding='same', activation='relu'))
    model.add(MaxPooling1D(pool_size=4))
    model.add(Dropout(0.25))
    model.add(Bidirectional(LSTM(64, activation='tanh', dropout=0.2,
                                 recurrent_dropout=0.2)))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(1, activation='sigmoid'))
    model.summary()
    model.layers[0].set_weights([embedding_matrix])
    model.layers[0].trainable = False
    model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

    class Metrics(Callback):
        def on_train_begin(self, logs={}):
            self.val_f1s = []
            self.val_recalls = []
            self.val_precisions = []

        def on_epoch_end(self, epoch, logs={}):
            val_predict = (np.asarray(self.model.predict(self.validation_data[0]))).round()
            val_targ = self.validation_data[1]
            _val_f1 = f1_score(val_targ, val_predict)
            _val_recall = recall_score(val_targ, val_predict)
            _val_precision = precision_score(val_targ, val_predict)
            self.val_f1s.append(_val_f1)
            self.val_recalls.append(_val_recall)
            self.val_precisions.append(_val_precision)
            return

    metrics = Metrics()
    history = model.fit(x_train, y_train, epochs=10, batch_size=32,
                        validation_data=(x_val, y_val), callbacks=[metrics])
    model.save_weights('pre_trained_glove_model.h5')  # save the result
```
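Independently of the callback plumbing, the metrics themselves are easy to sanity-check. A small NumPy re-implementation of binary precision/recall/F1 (a sketch equivalent to the sklearn calls above for the binary case, with already-thresholded 0/1 predictions):

```python
import numpy as np

def precision_recall_f1(y_true, y_pred):
    # binary case; y_pred already rounded to 0/1
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```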

A question about capsule networks (Capsule) with tf.keras

```
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer
from tensorflow.keras import activations
from tensorflow.keras import utils
from tensorflow.keras.models import Model
from tensorflow.keras.layers import *
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import TensorBoard
import mnist
import tensorflow

batch_size = 128
num_classes = 10
epochs = 20

"""
Squashing function. We use 0.5 instead of the 1 in Hinton's paper: with 1,
every vector's norm is shrunk; with 0.5, norms below 0.5 are shrunk and
norms above 0.5 are enlarged.
"""
def squash(x, axis=-1):
    s_quared_norm = K.sum(K.square(x), axis, keepdims=True) + K.epsilon()
    scale = K.sqrt(s_quared_norm) / (0.5 + s_quared_norm)
    result = scale * x
    return result

# our own softmax rather than K.softmax, because K.softmax cannot take an axis
def softmax(x, axis=-1):
    ex = K.exp(x - K.max(x, axis=axis, keepdims=True))
    result = ex / K.sum(ex, axis=axis, keepdims=True)
    return result

# margin loss: takes y_true, y_pred and returns a score; pass it straight to fit
def margin_loss(y_true, y_pred):
    lamb, margin = 0.5, 0.1
    result = K.sum(y_true * K.square(K.relu(1 - margin - y_pred)) +
                   lamb * (1 - y_true) * K.square(K.relu(y_pred - margin)), axis=-1)
    return result

class Capsule(Layer):
    """Writing your own Keras layer requires overriding three methods plus __init__:
    1. build(input_shape): where you define the weights. This method must set
       self.built = True, which you can do by calling super([Layer], self).build().
    2. call(x): where the layer's logic lives. You usually only care about the
       first argument, the input tensor, unless you want the layer to support masking.
    3. compute_output_shape(input_shape): if your layer changes the shape of the
       input tensor, declare the shape change here so Keras can infer shapes automatically.
    4. __init__: the parameters your layer accepts.
    """
    def __init__(self, num_capsule, dim_capsule, routings=3, share_weights=True,
                 activation='squash', **kwargs):
        super(Capsule, self).__init__(**kwargs)  # Capsule inherits the **kwargs parameters
        self.num_capsule = num_capsule
        self.dim_capsule = dim_capsule
        self.routings = routings
        self.share_weights = share_weights
        if activation == 'squash':
            self.activation = squash
        else:
            self.activation = activation.get(activation)  # look up the activation

    # define the weights
    def build(self, input_shape):
        input_dim_capsule = input_shape[-1]
        if self.share_weights:
            # custom shared weights
            self.kernel = self.add_weight(
                name='capsule_kernel',
                shape=(1, input_dim_capsule, self.num_capsule * self.dim_capsule),
                initializer='glorot_uniform',
                trainable=True)
        else:
            input_num_capsule = input_shape[-2]
            self.kernel = self.add_weight(
                name='capsule_kernel',
                shape=(input_num_capsule, input_dim_capsule,
                       self.num_capsule * self.dim_capsule),
                initializer='glorot_uniform',
                trainable=True)
        super(Capsule, self).build(input_shape)  # must call Layer's build method

    # the layer's core logic
    def call(self, inputs):
        if self.share_weights:
            hat_inputs = K.conv1d(inputs, self.kernel)
        else:
            hat_inputs = K.local_conv1d(inputs, self.kernel, [1], [1])
        batch_size = K.shape(inputs)[0]
        input_num_capsule = K.shape(inputs)[1]
        hat_inputs = K.reshape(hat_inputs,
                               (batch_size, input_num_capsule,
                                self.num_capsule, self.dim_capsule))
        hat_inputs = K.permute_dimensions(hat_inputs, (0, 2, 1, 3))
        b = K.zeros_like(hat_inputs[:, :, :, 0])
        for i in range(self.routings):
            c = softmax(b, 1)
            o = self.activation(K.batch_dot(c, hat_inputs, [2, 2]))
            if K.backend() == 'theano':
                o = K.sum(o, axis=1)
            if i < self.routings - 1:
                b += K.batch_dot(o, hat_inputs, [2, 3])
                if K.backend() == 'theano':
                    o = K.sum(o, axis=1)
        return o

    def compute_output_shape(self, input_shape):  # infer the output shape automatically
        return (None, self.num_capsule, self.dim_capsule)

def MODEL():
    input_image = Input(shape=(32, 32, 3))
    x = Conv2D(64, (3, 3), activation='relu')(input_image)
    x = Conv2D(64, (3, 3), activation='relu')(x)
    x = AveragePooling2D((2, 2))(x)
    x = Conv2D(128, (3, 3), activation='relu')(x)
    x = Conv2D(128, (3, 3), activation='relu')(x)
    """
    Now convert to (batch_size, input_num_capsule, input_dim_capsule) and attach
    a capsule layer. The model's final output is the length of 10 capsules of
    dimension 16.
    """
    x = Reshape((-1, 128))(x)  # (None, 100, 128), i.e. (None, input_num, input_dim) for the capsule layer
    capsule = Capsule(num_capsule=10, dim_capsule=16,
                      routings=3, share_weights=True)(x)  # capsule: (None, 10, 16)
    output = Lambda(lambda z: K.sqrt(K.sum(K.square(z), axis=2)))(capsule)  # 10 probabilities
    model = Model(inputs=input_image, output=output)
    return model

if __name__ == '__main__':
    # load the data
    (x_train, y_train), (x_test, y_test) = mnist.load_data()
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255
    x_test /= 255
    y_train = tensorflow.keras.utils.to_categorical(y_train, num_classes)
    y_test = tensorflow.keras.utils.to_categorical(y_test, num_classes)
    # build the model
    model = MODEL()
    model.compile(loss=margin_loss, optimizer='adam', metrics=['accuracy'])
    model.summary()
    tfck = TensorBoard(log_dir='capsule')
    # train
    data_augmentation = True
    if not data_augmentation:
        print('Not using data augmentation.')
        model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
                  validation_data=(x_test, y_test), callbacks=[tfck], shuffle=True)
    else:
        print('Using real-time data augmentation.')
        # This will do preprocessing and realtime data augmentation:
        datagen = ImageDataGenerator(
            featurewise_center=False,             # set input mean to 0 over the dataset
            samplewise_center=False,              # set each sample mean to 0
            featurewise_std_normalization=False,  # divide inputs by dataset std
            samplewise_std_normalization=False,   # divide each input by its std
            zca_whitening=False,                  # apply ZCA whitening
            rotation_range=0,                     # randomly rotate images in 0 to 180 degrees
            width_shift_range=0.1,                # randomly shift images horizontally
            height_shift_range=0.1,               # randomly shift images vertically
            horizontal_flip=True,                 # randomly flip images
            vertical_flip=False)                  # randomly flip images
        # Compute quantities required for feature-wise normalization
        # (std, mean, and principal components if ZCA whitening is applied).
        datagen.fit(x_train)
        # Fit the model on the batches generated by datagen.flow().
        model.fit_generator(
            datagen.flow(x_train, y_train, batch_size=batch_size),
            epochs=epochs,
            validation_data=(x_test, y_test),
            callbacks=[tfck],
            workers=4)
```
That is the code. Running it produces this problem: ![screenshot](https://img-ask.csdn.net/upload/201902/26/1551184741_476774.png) ![screenshot](https://img-ask.csdn.net/upload/201902/26/1551184734_845838.png) Porting the official Keras capsule-network implementation to run under tf.keras still raises the same error.
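The behaviour of the squash function itself can be checked numerically: with the 0.5 variant above, the output's norm is ‖x‖² / (0.5 + ‖x‖²), which always stays below 1 while the direction is preserved. A NumPy sketch (the `eps` term stands in for `K.epsilon()`):

```python
import numpy as np

def squash(x, axis=-1, eps=1e-7):
    # NumPy version of the Keras squash above,
    # with 0.5 in place of the 1 used in Hinton's paper
    s = np.sum(np.square(x), axis=axis, keepdims=True) + eps
    scale = np.sqrt(s) / (0.5 + s)
    return scale * x
```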

Slow model convergence with TensorFlow multi-GPU parallel training

When training a deep neural network on multiple GPUs in parallel, reading the MNIST training data in TFRecords form, I found that it converges more slowly and runs longer than training the same model directly on the MNIST data. The model itself is known to be fine. I'd appreciate help figuring out what makes it run slowly and take so long.
```
import os
import time
import numpy as np
import tensorflow as tf
from datetime import datetime
import tensorflow.compat.v1 as v1
from tensorflow.examples.tutorials.mnist import input_data

BATCH_SIZE = 100
LEARNING_RATE = 1e-4
LEARNING_RATE_DECAY = 0.99
REGULARZTION_RATE = 1e-4
EPOCHS = 10000
MOVING_AVERAGE_DECAY = 0.99
N_GPU = 2

MODEL_SAVE_PATH = r'F:\model\log_dir'
MODEL_NAME = 'model.ckpt'
TRAIN_PATH = r'F:\model\threads_file\MNIST_data_tfrecords\train.tfrecords'
TEST_PATH = r'F:\model\threads_file\MNIST_data_tfrecords\test.tfrecords'

def __int64_feature(value):
    return v1.train.Feature(int64_list=v1.train.Int64List(value=[value]))

def __bytes_feature(value):
    return v1.train.Feature(bytes_list=v1.train.BytesList(value=[value]))

def creat_tfrecords(path, data, labels):
    writer = tf.io.TFRecordWriter(path)
    for i in range(len(data)):
        image = data[i].tostring()
        label = labels[i]
        examples = v1.train.Example(features=v1.train.Features(feature={
            'image': __bytes_feature(image),
            'label': __int64_feature(label)
        }))
        writer.write(examples.SerializeToString())
    writer.close()

def parser(record):
    features = v1.parse_single_example(record, features={
        'image': v1.FixedLenFeature([], tf.string),
        'label': v1.FixedLenFeature([], tf.int64)
    })
    image = tf.decode_raw(features['image'], tf.uint8)
    image = tf.reshape(image, [28, 28, 1])
    image = tf.cast(image, tf.float32)
    label = tf.cast(features['label'], tf.int32)
    label = tf.one_hot(label, 10, on_value=1, off_value=0)
    return image, label

def get_input(batch_size, path):
    dataset = tf.data.TFRecordDataset([path])
    dataset = dataset.map(parser)
    dataset = dataset.shuffle(10000)
    dataset = dataset.repeat(100)
    dataset = dataset.batch(batch_size)
    iterator = dataset.make_one_shot_iterator()
    image, label = iterator.get_next()
    return image, label

def model_inference(images, labels, rate, regularzer=None, reuse_variables=None):
    with v1.variable_scope(v1.get_variable_scope(), reuse=reuse_variables):
        with tf.compat.v1.variable_scope('First_conv'):
            w1 = tf.compat.v1.get_variable('weights', [3, 3, 1, 32], tf.float32,
                                           initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
            if regularzer:
                tf.add_to_collection('losses', regularzer(w1))
            b1 = tf.compat.v1.get_variable('biases', [32], tf.float32,
                                           initializer=tf.compat.v1.constant_initializer(0.1))
            activation1 = tf.nn.relu(tf.nn.conv2d(images, w1, strides=[1, 1, 1, 1],
                                                  padding='SAME') + b1)
            out1 = tf.nn.max_pool2d(activation1, ksize=[1, 2, 2, 1],
                                    strides=[1, 2, 2, 1], padding='SAME')
        with tf.compat.v1.variable_scope('Second_conv'):
            w2 = tf.compat.v1.get_variable('weight', [3, 3, 32, 64], tf.float32,
                                           initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
            if regularzer:
                tf.add_to_collection('losses', regularzer(w2))
            b2 = tf.compat.v1.get_variable('biases', [64], tf.float32,
                                           initializer=tf.compat.v1.constant_initializer(0.1))
            activation2 = tf.nn.relu(tf.nn.conv2d(out1, w2, strides=[1, 1, 1, 1],
                                                  padding='SAME') + b2)
            out2 = tf.nn.max_pool2d(activation2, ksize=[1, 2, 2, 1],
                                    strides=[1, 2, 2, 1], padding='SAME')
            out3 = tf.reshape(out2, [-1, 7 * 7 * 64], name='flatten')
        with tf.compat.v1.variable_scope('FC_1'):
            w3 = tf.compat.v1.get_variable('weight', [7 * 7 * 64, 1024], tf.float32,
                                           initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
            if regularzer:
                tf.add_to_collection('losses', regularzer(w3))
            b3 = tf.compat.v1.get_variable('biases', [1024], tf.float32,
                                           initializer=tf.compat.v1.constant_initializer(0.1))
            activation3 = tf.nn.relu(tf.matmul(out3, w3) + b3)
            out4 = tf.nn.dropout(activation3, keep_prob=rate)
        with tf.compat.v1.variable_scope('FC_2'):
            w4 = tf.compat.v1.get_variable('weight', [1024, 10], tf.float32,
                                           initializer=tf.compat.v1.truncated_normal_initializer(stddev=0.1))
            if regularzer:
                tf.add_to_collection('losses', regularzer(w4))
            b4 = tf.compat.v1.get_variable('biases', [10], tf.float32,
                                           initializer=tf.compat.v1.constant_initializer(0.1))
            output = tf.nn.softmax(tf.matmul(out4, w4) + b4)
        with tf.compat.v1.variable_scope('Loss_entropy'):
            if regularzer:
                loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
                    labels=tf.argmax(labels, 1), logits=output)) \
                    + tf.add_n(tf.get_collection('losses'))
            else:
                loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
                    labels=tf.argmax(labels, 1), logits=output))
        with tf.compat.v1.variable_scope('Accuracy'):
            correct_data = tf.equal(tf.math.argmax(labels, 1), tf.math.argmax(output, 1))
            accuracy = tf.reduce_mean(tf.cast(correct_data, tf.float32, name='accuracy'))
        return output, loss, accuracy

def average_gradients(tower_grads):
    average_grads = []
    for grad_and_vars in zip(*tower_grads):
        grads = []
        for g, v2 in grad_and_vars:
            expanded_g = tf.expand_dims(g, 0)
            grads.append(expanded_g)
        grad = tf.concat(grads, 0)
        grad = tf.reduce_mean(grad, 0)
        v = grad_and_vars[0][1]
        grad_and_var = (grad, v)
        average_grads.append(grad_and_var)
    return average_grads

def main(argv=None):
    with tf.Graph().as_default(), tf.device('/cpu:0'):
        x, y = get_input(batch_size=BATCH_SIZE, path=TRAIN_PATH)
        regularizer = tf.contrib.layers.l2_regularizer(REGULARZTION_RATE)
        global_step = v1.get_variable('global_step', [],
                                      initializer=v1.constant_initializer(0),
                                      trainable=False)
        lr = v1.train.exponential_decay(LEARNING_RATE, global_step,
                                        55000 / BATCH_SIZE, LEARNING_RATE_DECAY)
        opt = v1.train.AdamOptimizer(lr)
        tower_grads = []
        reuse_variables = False
        device = ['/gpu:0', '/cpu:0']
        for i in range(len(device)):
            with tf.device(device[i]):
                with v1.name_scope(device[i][1:4] + '_0') as scope:
                    out, cur_loss, acc = model_inference(x, y, 0.3, regularizer,
                                                         reuse_variables)
                    reuse_variables = True
                    grads = opt.compute_gradients(cur_loss)
                    tower_grads.append(grads)
        grads = average_gradients(tower_grads)
        for grad, var in grads:
            if grad is not None:
                v1.summary.histogram('gradients_on_average/%s' % var.op.name, grad)
        apply_gradient_op = opt.apply_gradients(grads, global_step)
        for var in v1.trainable_variables():
            tf.summary.histogram(var.op.name, var)
        variable_averages = v1.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY,
                                                              global_step)
        variable_to_average = (v1.trainable_variables() + v1.moving_average_variables())
        variable_averages_op = variable_averages.apply(variable_to_average)
        train_op = tf.group(apply_gradient_op, variable_averages_op)
        saver = v1.train.Saver(max_to_keep=1)
        summary_op = v1.summary.merge_all()  # merge_all saves every summary to disk
        init = v1.global_variables_initializer()
        with v1.Session(config=v1.ConfigProto(allow_soft_placement=True,
                                              log_device_placement=True)) as sess:
            init.run()
            # file used to store the graph
            summary_writer = v1.summary.FileWriter(MODEL_SAVE_PATH, sess.graph)
            for step in range(EPOCHS):
                try:
                    start_time = time.time()
                    _, loss_value, out_value, acc_value = sess.run(
                        [train_op, cur_loss, out, acc])
                    duration = time.time() - start_time
                    if step != 0 and step % 100 == 0:
                        num_examples_per_step = BATCH_SIZE * N_GPU
                        examples_per_sec = num_examples_per_step / duration
                        sec_per_batch = duration / N_GPU
                        format_str = '%s: step %d, loss = %.2f(%.1f examples/sec; %.3f sec/batch), accuracy = %.2f'
                        print(format_str % (datetime.now(), step, loss_value,
                                            examples_per_sec, sec_per_batch, acc_value))
                        summary = sess.run(summary_op)
                        summary_writer.add_summary(summary, step)
                    if step % 100 == 0 or (step + 1) == EPOCHS:
                        checkpoint_path = os.path.join(MODEL_SAVE_PATH, MODEL_NAME)
                        saver.save(sess, checkpoint_path, global_step=step)
                except tf.errors.OutOfRangeError:
                    break

if __name__ == '__main__':
    tf.app.run()
```
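The tower-gradient averaging in `average_gradients` can be mirrored in NumPy to see what it computes: for each variable, the gradients from all towers are stacked along a new axis and averaged elementwise (an illustrative sketch named `average_gradients_np`, not the TF graph version):

```python
import numpy as np

def average_gradients_np(tower_grads):
    # tower_grads: one list of (grad, var) pairs per tower,
    # with the same variable order in every tower.
    average_grads = []
    for grad_and_vars in zip(*tower_grads):
        stacked = np.stack([g for g, _ in grad_and_vars], axis=0)
        # keep the variable from the first tower, as in the TF version
        average_grads.append((stacked.mean(axis=0), grad_and_vars[0][1]))
    return average_grads
```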

deeplab v3+: training loss does not converge

* I am using the official code https://github.com/tensorflow/models/tree/master/research/deeplab to reproduce deeplab v3+;
* The training data is the standard Pascal VOC 2012. Before training, I followed the official instructions: I ran the script download_and_convert_voc2012.sh to download the VOC 2012 data, converted the labels to single-channel, and converted the data to the required tfrecord format;
* The pretrained model was also downloaded from the provided model_zoo: https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md;
* The learning rate is left at the default, i.e. learning rate = 0.0001;
* Linux Ubuntu 16.04; TensorFlow 1.6.0 installed from Anaconda; CUDA 9.0 / cudnn 7.0.5; GeForce GTX 1080 Ti;
* The exact training command is:
```
python deeplab/train.py \
    --logtostderr \
    --training_number_of_steps=30000 \
    --train_split="train" \
    --model_variant="xception_65" \
    --atrous_rates=6 \
    --atrous_rates=12 \
    --atrous_rates=18 \
    --output_stride=16 \
    --decoder_output_stride=4 \
    --train_crop_size=513 \
    --train_crop_size=513 \
    --train_batch_size=2 \
    --dataset="pascal_voc_seg" \
    --fine_tune_batch_norm = False \
    --tf_initial_checkpoint="{path of downloaded checkpoint}/deeplabv3_pascal_train_aug/model.ckpt.index" \
    --train_logdir="{path to write to}/exp/train_on_train_set/train" \
    --dataset_dir="{dataset path}/pascal_voc_seg/tfrecord"
```
* However, the loss never converges: ![screenshot](https://img-ask.csdn.net/upload/201812/25/1545727654_85154.png)
* Eventually a NaN error appears: ![screenshot](https://img-ask.csdn.net/upload/201812/25/1545727689_745383.png)
* If I train for fewer steps and then evaluate, the mIOU is only a few hundredths: ![screenshot](https://img-ask.csdn.net/upload/201812/26/1545785941_997452.jpg)
* I have not found the cause. The steps seem correct, and I have consulted various blog posts — nobody else seems to hit this. Hoping someone can help.

Transfer learning for medical image analysis: accuracy stays constant while training the network...

I am fine-tuning a VGG16 network with weights imported from keras; the top three layers form a small classifier whose weights are learned during training. The training set has about 300 samples and the validation set about 80. After the program runs, the loss and acc still change between the first and second epoch, but after that they stop changing, and the validation accuracy stays close to zero the whole time. I would appreciate it if someone versed in convolutional networks and machine learning could take a look at where the problem is.

```
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import GlobalAveragePooling2D
import numpy as np
from keras.optimizers import RMSprop
from keras.utils import np_utils
import matplotlib.pyplot as plt
from keras import regularizers
from keras.applications.vgg16 import VGG16
from keras import optimizers
from keras.layers.core import Lambda
from keras import backend as K
from keras.models import Model

# A LossHistory callback class that records loss and acc for plotting under keras
class LossHistory(keras.callbacks.Callback):
    def on_train_begin(self, logs={}):  # at on_batch_begin, logs contains size, i.e. the number of samples in the current batch
        self.losses = {'batch': [], 'epoch': []}
        self.accuracy = {'batch': [], 'epoch': []}
        self.val_loss = {'batch': [], 'epoch': []}
        self.val_acc = {'batch': [], 'epoch': []}

    def on_batch_end(self, batch, logs={}):
        self.losses['batch'].append(logs.get('loss'))
        self.accuracy['batch'].append(logs.get('acc'))
        self.val_loss['batch'].append(logs.get('val_loss'))
        self.val_acc['batch'].append(logs.get('val_acc'))

    def on_epoch_end(self, batch, logs={}):  # read the metrics from logs after each epoch
        self.losses['epoch'].append(logs.get('loss'))
        self.accuracy['epoch'].append(logs.get('acc'))
        self.val_loss['epoch'].append(logs.get('val_loss'))
        self.val_acc['epoch'].append(logs.get('val_acc'))

    def loss_plot(self, loss_type):
        iters = range(len(self.losses[loss_type]))  # x axis for the plots
        plt.figure()  # create an empty figure
        if loss_type == 'epoch':
            plt.subplot(211)
            plt.plot(iters, self.accuracy[loss_type], 'r', label='train acc')
            plt.plot(iters, self.val_acc[loss_type], 'b', label='val acc')  # val_acc drawn in blue
            plt.grid(True)
            plt.xlabel(loss_type)
            plt.ylabel('accuracy')
            plt.show()
            plt.subplot(212)
            plt.plot(iters, self.losses[loss_type], 'r', label='train loss')
            plt.plot(iters, self.val_loss[loss_type], 'b', label='val loss')
            plt.xlabel(loss_type)
            plt.ylabel('loss')
            plt.legend(loc="upper right")  # put all legends on one axes; loc sets the position
            plt.show()
            print(np.mean(self.val_acc[loss_type]))
            print(np.std(self.val_acc[loss_type]))

seed = 7
np.random.seed(seed)

# training hyperparameters
batch_size = 32
num_classes = 2
epochs = 100
weight_decay = 0.0005
learn_rate = 0.0001

# load the training/test data, resize, show basic info
X_train = np.load(open('/image_BRATS_240_240_3_normal.npy', mode='rb'))
Y_train = np.load(open('/label_BRATS_240_240_3_normal.npy', mode='rb'))
Y_train = keras.utils.to_categorical(Y_train, 2)

# build the network
model_vgg16 = VGG16(include_top=False, weights='imagenet', input_shape=(240, 240, 3), classes=2)
model_vgg16.layers.pop()
model = Sequential()
model.add(model_vgg16)
model.add(Flatten(input_shape=X_train.shape[1:]))
model.add(Dense(436, activation='relu'))  # returns a vector of size x*10
model.add(Dense(2, activation='softmax'))
# model(inputs=model_vgg16.input, outputs=predictions)
for layer in model_vgg16.layers[:13]:
    layer.trainable = False
model_vgg16.summary()
model.compile(optimizer=RMSprop(lr=learn_rate, decay=weight_decay),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
history = LossHistory()
model.fit(X_train, Y_train,
          batch_size=batch_size, epochs=epochs,
          verbose=1,
          shuffle=True,
          validation_split=0.2,
          callbacks=[history])

# model evaluation
history.loss_plot('epoch')
```

For example: ![run result](https://img-ask.csdn.net/upload/201804/19/1524134477_869793.png)
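One thing worth checking, given a validation accuracy pinned near zero: Keras's `validation_split` takes the *last* fraction of the arrays before any shuffling, so if the `.npy` files are ordered by class, the validation set can end up containing a single class. The class-ordered labels below are a hypothetical illustration, not the asker's actual data:

```python
import numpy as np

# Hypothetical class-ordered labels, as stacked .npy exports often are
y = np.array([0] * 150 + [1] * 150)

# model.fit(validation_split=0.2) slices the LAST 20% off the arrays
# before shuffling, so ordered data yields a one-class validation set.
split = int(len(y) * (1 - 0.2))
y_val = y[split:]
print(sorted(set(y_val.tolist())))  # [1] — only class 1 lands in validation
```

If that is what is happening here, shuffling the arrays once before calling `fit` (e.g. with a permuted index) would give a mixed validation set.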

Out of GPU memory when running a neural network

# Type 1 (optimizing the model parameters with automatic differentiation)
```
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import optimizers, datasets
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import os

# load the mnist dataset
def mnist_dataset():
    (x, y), (x_test, y_test) = datasets.mnist.load_data()
    x_train, x_valid, y_train, y_valid = train_test_split(x, y, test_size=0.2)  # split off a validation set
    # normalize
    x_train = tf.cast(x_train/255.0, dtype=tf.float32)
    x_valid = tf.cast(x_valid/255.0, dtype=tf.float32)
    x_test = tf.cast(x_test/255.0, dtype=tf.float32)
    # add a channel dimension: ( , , ) --> ( , , , )
    x_train = tf.expand_dims(x_train, axis=3)
    x_valid = tf.expand_dims(x_valid, axis=3)
    x_test = tf.expand_dims(x_test, axis=3)
    # one-hot encode the labels
    y_train = tf.one_hot(y_train, depth=10, dtype=tf.float32)
    y_valid = tf.one_hot(y_valid, depth=10, dtype=tf.float32)
    y_test = tf.one_hot(y_test, depth=10, dtype=tf.float32)
    return (x_train, y_train), (x_valid, y_valid), (x_test, y_test)

# define the model
class Convolution_NN(keras.Model):
    def __init__(self):
        super(Convolution_NN, self).__init__()
        # super(): https://wiki.jikexueyuan.com/project/explore-python/Class/super.html
        self.L1_conv = Conv2D(filters=10, kernel_size=(5, 5), activation='relu', padding='same')
        self.L2_conv = Conv2D(filters=10, kernel_size=(5, 5), activation='relu', padding='same')
        self.pool = MaxPooling2D(pool_size=(2, 2), strides=2)
        self.flat = Flatten()
        self.dense1 = Dense(100, activation='tanh')
        self.dense2 = Dense(10, activation='softmax')

    def call(self, inputs):
        h1 = self.L1_conv(inputs)
        h1_pool = self.pool(h1)
        h2 = self.L2_conv(h1_pool)
        h2_pool = self.pool(h2)
        flat_h = self.flat(h2_pool)
        dense1 = self.dense1(flat_h)
        logits = self.dense2(dense1)
        return logits

# cross-entropy loss
def compute_loss(logits, labels):
    return tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels, logits))

# prediction accuracy
def compute_accuracy(logits, labels):
    predictions = tf.argmax(logits, axis=1)
    labels = tf.argmax(labels, axis=1)
    return tf.reduce_mean(tf.cast(tf.equal(predictions, labels), tf.float32))

# parameter update
def train_one_step(model, optimizer, x, y):
    with tf.GradientTape() as tape:
        logits = model(x)
        loss = compute_loss(logits, y)
    # compute gradient
    grads = tape.gradient(loss, model.trainable_variables)
    # update the weights
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

# ------------------------------
if __name__ == '__main__':
    (x_train, y_train), (x_valid, y_valid), (x_test, y_test) = mnist_dataset()
    # training hyperparameters
    training_epochs = 20  # number of epochs
    batch_size = 50       # samples per training step (mini-batch size)
    learning_rate = 0.001
    model = Convolution_NN()
    optimizer = optimizers.Adam(learning_rate=learning_rate)
    steps = int(x_train.shape[0]/batch_size)  # batches per epoch
    for epoch in range(training_epochs):
        for step in range(steps):
            X = x_train[step*batch_size:(step+1)*batch_size]
            Y = y_train[step*batch_size:(step+1)*batch_size]
            train_one_step(model, optimizer, X, Y)
```
# Type 2 (training the model parameters with tf's high-level Keras API)
```
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import optimizers, datasets
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import os

# load the mnist dataset
def mnist_dataset():
    (x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()
    # normalize
    x_train = tf.cast(x_train/255.0, dtype=tf.float32)
    x_test = tf.cast(x_test/255.0, dtype=tf.float32)
    # add a channel dimension: ( , , ) --> ( , , , )
    x_train = tf.expand_dims(x_train, axis=3)
    x_test = tf.expand_dims(x_test, axis=3)
    # one-hot encode the labels
    y_train = tf.one_hot(y_train, depth=10, dtype=tf.float32)
    y_test = tf.one_hot(y_test, depth=10, dtype=tf.float32)
    return (x_train, y_train), (x_test, y_test)

# define the model
class Convolution_NN(keras.Model):
    def __init__(self):
        super(Convolution_NN, self).__init__()
        # super(): https://wiki.jikexueyuan.com/project/explore-python/Class/super.html
        self.L1_conv = Conv2D(filters=10, kernel_size=(5, 5), activation='relu', padding='same')
        self.L2_conv = Conv2D(filters=10, kernel_size=(5, 5), activation='relu', padding='same')
        self.pool = MaxPooling2D(pool_size=(2, 2), strides=2)
        self.flat = Flatten()
        self.dense1 = Dense(100, activation='tanh')
        self.dense2 = Dense(10, activation='softmax')

    def call(self, inputs):
        h1 = self.L1_conv(inputs)
        h1_pool = self.pool(h1)
        h2 = self.L2_conv(h1_pool)
        h2_pool = self.pool(h2)
        flat_h = self.flat(h2_pool)
        dense1 = self.dense1(flat_h)
        logits = self.dense2(dense1)
        return logits

# ------------------------------
if __name__ == '__main__':
    # os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # GPU out of memory (lower batch_size); fall back to CPU
    (x_train, y_train), (x_test, y_test) = mnist_dataset()
    model = Convolution_NN()
    optimizer = optimizers.Adam()
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    # training hyperparameters
    training_epochs = 20  # number of epochs
    batch_size = 50       # samples per training step (mini-batch size)
    # train the model
    train_history = model.fit(x_train, y_train,
                              validation_split=0.2,
                              epochs=training_epochs,
                              batch_size=batch_size,
                              verbose=2)
```
## In one sentence
With Type 1, where I optimize the parameters myself, running on the GPU gives: OOM when allocating tensor with shape[48000,28,28,10] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:Conv2D] — i.e. out of memory. Type 2, which just calls the API to optimize the parameters, runs smoothly on the GPU. Why? I am baffled! In principle Keras should be optimizing the parameters the same way I do, only wrapped up — why does its version run on the GPU while mine runs out of memory?
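One relevant difference (a hypothesis, not verified on the asker's machine): Type 1 casts the *entire* training split to device-resident tensors up front and then slices it per step — and the OOM shape `[48000, 28, 28, 10]` is exactly the whole 48 000-sample split going through the first `Conv2D` in a single call — whereas `model.fit` streams the data batch by batch. A `tf.data` pipeline reproduces what `fit` does, keeping only one batch on the device at a time. The array sizes below are toy stand-ins for the MNIST tensors:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the MNIST arrays; keep the data as NumPy on the host.
x_train = np.random.rand(1000, 28, 28, 1).astype(np.float32)
y_train = np.eye(10, dtype=np.float32)[np.random.randint(0, 10, 1000)]

# tf.data moves one batch at a time to the device instead of the full tensor.
ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(1000).batch(50)

n_batches = 0
for xb, yb in ds:          # iterate as in the manual training loop
    n_batches += 1         # train_one_step(model, optimizer, xb, yb) would go here
print(n_batches)  # 1000 / 50 = 20 batches
```

Iterating this dataset in the Type 1 loop instead of slicing GPU tensors should bring its memory profile much closer to `model.fit`'s.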

optim.compute_gradients: why is the first column of the returned pairs None?

1. Problem description
The model is built with keras.models.Sequential, the loss is tf.losses.sparse_softmax_cross_entropy, and the trainable variables are obtained with var_list=tf.trainable_variables(). The gradients are computed with:
loss_op = tf.losses.sparse_softmax_cross_entropy(y, y_pred)
grads_vars = optim.compute_gradients(loss_op, tf.trainable_variables())
The first element of every pair returned in grads_vars is None, which makes the code below fail. Why is the first column of grads_vars None?
2. Relevant code
```
import tensorflow as tf
import numpy as np
import time
import keras

# build a random dataset
x_dataset = np.random.rand(1000, 28, 28, 1)
y_dataset = np.random.randint(0, 10, size=(1000,))

act = tf.nn.leaky_relu
epoch = 200
batch_size = 5000
n_batch = len(x_dataset) // batch_size
# how many sub-batches each batch is split into
subdivisions = 50
subdivisions_batch_size = int(np.ceil(batch_size / subdivisions))
# whether to use the sub-batch method; False means the default method
is_on_subdivisions = True

def get_model(is_train=True, reuse=False):
    with tf.variable_scope('model', reuse=reuse):
        net = keras.models.Sequential()
        net.add(keras.layers.Conv2D(128, (3, 3), input_shape=(28, 28, 1), strides=(2, 2), padding='same', name='c1'))
        net.add(keras.layers.GlobalAveragePooling2D())
        net.add(keras.layers.Dense(10))
    return net

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
y = tf.placeholder(tf.int32, [None, ])
net = get_model()
y_pred = tf.cast(tf.argmax(net.outputs[0], axis=-1), dtype=tf.float32)
loss_op = tf.losses.sparse_softmax_cross_entropy(y, y_pred)
optim = tf.train.AdamOptimizer(0.01)
var_list = tf.trainable_variables()
grads_vars = optim.compute_gradients(loss_op, tf.trainable_variables())
# the first column of grads_vars is None — why?
for gv in grads_vars:
    print(gv)

# delete parameters without gradients, in reverse order to avoid index trouble
for i in range(len(grads_vars))[::-1]:
    if grads_vars[i][0] is None:
        del grads_vars[i]
# because the first column is all None, every variable gets deleted, causing the exception below!
print('len(grads_vars):', len(grads_vars))

# build gradient caches: None in the first column of grads_vars triggers the exception
grads_cache = [tf.Variable(np.zeros(t[0].shape.as_list(), np.float32), trainable=False) for t in grads_vars]
# op to clear the gradient cache, called before each batch
clear_grads_cache_op = tf.group([gc.assign(tf.zeros_like(gc)) for gc in grads_cache])
# op to accumulate the gradient of each sub-batch
accumulate_grad_op = tf.group([gc.assign_add(gv[0]) for gc, gv in zip(grads_cache, grads_vars)])
# average the gradients
mean_grad = [gc / tf.to_float(subdivisions) for gc in grads_cache]
# assemble the gradient list
new_grads_vars = [(g, gv[1]) for g, gv in zip(mean_grad, grads_vars)]
# op to apply the gradients once all sub-batch gradients are accumulated
apply_grad_op = optim.apply_gradients(new_grads_vars)

# the original optim, for comparison with the above
ori_optim_op = tf.train.AdamOptimizer(0.01).minimize(loss_op, var_list=net.all_params)

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.allow_soft_placement = True
sess = tf.Session(config=config)
sess.run(tf.global_variables_initializer())

for e in range(epoch):
    loss_sum = 0
    for b in progressbar(range(n_batch)):
        x_batch = x_dataset[b * batch_size: (b + 1) * batch_size]
        y_batch = y_dataset[b * batch_size: (b + 1) * batch_size]
        if is_on_subdivisions:
            # the gradient cache must be cleared before each batch
            sess.run(clear_grads_cache_op)
            sub_loss_sum = 0
            for s in range(subdivisions):
                x_sub_batch = x_batch[s * subdivisions_batch_size: (s + 1) * subdivisions_batch_size]
                y_sub_batch = y_batch[s * subdivisions_batch_size: (s + 1) * subdivisions_batch_size]
                if len(x_sub_batch) == 0:
                    break
                feed_dict = {x: x_sub_batch, y: y_sub_batch}
                _, los = sess.run([accumulate_grad_op, loss_op], feed_dict)
                sub_loss_sum += los
            loss_sum += sub_loss_sum / subdivisions
            # gradient accumulation finished; apply the gradients
            sess.run(apply_grad_op)
            # end of this batch
        else:
            feed_dict = {x: x_batch, y: y_batch}
            _, los = sess.run([ori_optim_op, loss_op], feed_dict)
            loss_sum += los
            time.sleep(0.2)
    print('loss', loss_sum / n_batch)
```
3. Error message
```
grads_vars:
(None, <tf.Variable 'model/c1/kernel:0' shape=(3, 3, 1, 128) dtype=float32_ref>)
(None, <tf.Variable 'model/c1/bias:0' shape=(128,) dtype=float32_ref>)
(None, <tf.Variable 'model/dense_1/kernel:0' shape=(128, 10) dtype=float32_ref>)
(None, <tf.Variable 'model/dense_1/bias:0' shape=(10,) dtype=float32_ref>)
(None, <tf.Variable 'model_1/c1/kernel:0' shape=(3, 3, 1, 128) dtype=float32_ref>)
(None, <tf.Variable 'model_1/c1/bias:0' shape=(128,) dtype=float32_ref>)
(None, <tf.Variable 'model_1/dense_2/kernel:0' shape=(128, 10) dtype=float32_ref>)
(None, <tf.Variable 'model_1/dense_2/bias:0' shape=(10,) dtype=float32_ref>)
(None, <tf.Variable 'model_2/c1/kernel:0' shape=(3, 3, 1, 128) dtype=float32_ref>)
(None, <tf.Variable 'model_2/c1/bias:0' shape=(128,) dtype=float32_ref>)
(None, <tf.Variable 'model_2/dense_3/kernel:0' shape=(128, 10) dtype=float32_ref>)
(None, <tf.Variable 'model_2/dense_3/bias:0' shape=(10,) dtype=float32_ref>)
len(grads_vars): 0
```
4. Approaches tried
5. Screenshots
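A likely culprit (inferred from the snippet, not from running the asker's full script): `y_pred` is built with `tf.cast(tf.argmax(...))` before being handed to the loss. `argmax` has no gradient, so the path from `loss_op` back to the weights is cut and `compute_gradients` returns `None` for every variable — and `tf.losses.sparse_softmax_cross_entropy` expects the raw logits anyway, not a predicted class index. The effect is easy to reproduce in TF2 eager mode:

```python
import tensorflow as tf

x = tf.Variable([[1.0, 2.0, 3.0]])  # toy "logits" variable

with tf.GradientTape() as tape:
    # mirroring the question: the predicted class index is taken with argmax
    y_pred = tf.cast(tf.argmax(x, axis=-1), tf.float32)
    loss = tf.reduce_sum(y_pred)
grad_argmax = tape.gradient(loss, x)
print(grad_argmax)  # None: argmax is non-differentiable, the graph is cut

labels = tf.constant([2])
with tf.GradientTape() as tape:
    # feeding the raw logits to the loss keeps the path differentiable
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=x)
grad_logits = tape.gradient(loss, x)
print(grad_logits is None)  # False: a real gradient flows back to x
```

In the asker's graph the analogous fix would be to pass the network's logit output to the loss directly and keep `argmax` only for computing predictions/accuracy.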

Why does the loss behave so strangely when I train with TensorFlow 2.0?

I built a deepfm model with tf2.0 for a binary-classification prediction task. During training the train loss keeps going down while the val loss keeps going up, and halfway through training the memory runs out — what is going on?
```
Train on 19532 steps, validate on 977 steps
Epoch 1/5
19532/19532 [==============================] - 549s 28ms/step - loss: 0.4660 - AUC: 0.8519 - val_loss: 1.0059 - val_AUC: 0.5829
Epoch 2/5
19532/19532 [==============================] - 522s 27ms/step - loss: 0.1861 - AUC: 0.9787 - val_loss: 1.7618 - val_AUC: 0.5590
Epoch 3/5
17150/19532 [=========================>....] - ETA: 1:06 - loss: 0.0877 - AUC: 0.9951
Process finished with exit code 137
```
One more question: I disabled eager mode in my design, so I had to initialize with the following code:
```
sess.run([tf.compat.v1.global_variables_initializer(),
          tf.compat.v1.tables_initializer()])
```
But my code uses other initialization methods:
```
initializer = tf.keras.initializers.TruncatedNormal(stddev=stddev, seed=29)
regularizer = tf.keras.regularizers.l2(l2_reg)
....
dnn_hidden_layer_3 = tf.keras.layers.Dense(64, activation='selu',
                                           kernel_initializer=initializer,
                                           kernel_regularizer=regularizer)(dnn_hidden_layer_2)
....
```
If I do it this way, will the variables still be initialized with the methods I defined? I am a complete beginner — many thanks in advance, everyone!
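A falling train loss with a rising val loss is the textbook overfitting signature — the model is likely memorizing the training set from the second epoch on (exit code 137 usually means the OS killed the process for exhausting memory, which is a separate issue). One standard mitigation, sketched here on toy data with all layer sizes being illustrative assumptions rather than the asker's DeepFM architecture, is early stopping on `val_loss`:

```python
import numpy as np
import tensorflow as tf

# Toy binary-classification data (a hypothetical stand-in for the DeepFM inputs)
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8)).astype("float32")
y = rng.integers(0, 2, size=(256, 1)).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Stop as soon as val_loss stops improving, and roll back to the best weights
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=2, restore_best_weights=True)

history = model.fit(x, y, validation_split=0.2, epochs=50,
                    batch_size=32, verbose=0, callbacks=[early_stop])
print(len(history.history["val_loss"]) <= 50)  # True: training halted at or before the cap
```

Dropout or L2 regularization on the deep part are the other usual levers for this pattern.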

Matrix type mismatch error when training a machine-translation model — help wanted

Help wanted... I'm doing my graduation project on machine translation — choosing the topic was fun, doing it has been one pit after another. I keep hitting this: TypeError: Input 'b' of 'MatMul' Op has type float32 that does not match type int32 of argument 'a'. I found a project on GitHub that uses tensorflow and tf.keras and am running it first to get a feel for things, on tensorflow 2.0.0; the error appears while training the model. Could some kind soul tell me how to fix the code? Many thanks! The code:
```
def simple_model(input_shape, output_sequence_length, english_vocab_size, french_vocab_size):
    """
    Build and train a basic RNN on x and y
    :param input_shape: Tuple of input shape
    :param output_sequence_length: Length of output sequence
    :param english_vocab_size: Number of unique English words in the dataset
    :param french_vocab_size: Number of unique French words in the dataset
    :return: Keras model built, but not trained
    """
    # TODO: Build the model
    learning_rate = 1e-3
    input_seq = Input(input_shape[1:])
    rnn = GRU(64, return_sequences=True)(input_seq)
    logits = TimeDistributed(Dense(french_vocab_size))(rnn)
    model = Model(input_seq, Activation('softmax')(logits))
    model.summary()
    model.compile(loss=sparse_categorical_crossentropy,
                  optimizer=Adam(learning_rate),
                  metrics=['accuracy'])
    return model

# Reshaping the input to work with a basic RNN
tmp_x = pad(preproc_english_sentences, max_french_sequence_length)
tmp_x = tmp_x.reshape((-1, preproc_french_sentences.shape[-2], 1))
print(tmp_x.shape)

# Train the neural network
simple_rnn_model = simple_model(
    tmp_x.shape,
    max_french_sequence_length,
    english_vocab_size,
    french_vocab_size)
simple_rnn_model.fit(tmp_x, preproc_french_sentences, batch_size=1024, epochs=50, validation_split=0.2)

# Print prediction(s)
print("")
print(logits_to_text(simple_rnn_model.predict(tmp_x[:1])[0], french_tokenizer))
```
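This MatMul complaint ('a' is int32, 'b' is float32) typically appears when integer inputs meet float32 weights. Whether that is the cause here is an assumption on my part, but a cheap check is to cast the padded token ids to float32 before reshaping and fitting:

```python
import numpy as np

# Hypothetical padded token-id matrix, like the output of Keras pad_sequences
tmp_x = np.array([[1, 4, 2, 0], [3, 1, 0, 0]])
print(np.issubdtype(tmp_x.dtype, np.integer))  # True: the ids are integers

tmp_x = tmp_x.astype(np.float32)  # cast before reshaping / feeding the model
print(tmp_x.dtype)  # float32
```

In the asker's script the cast would go right after the `pad(...)` call, before `tmp_x.reshape(...)`.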

In a GAN, when the real x and the generated G(z) are both fed into D, does it output one value or two?

In a GAN, when the real x and the generator's output G(z) are fed into D at the same time, is the output a single probability between 0 and 1, or two probabilities between 0 and 1?
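D is applied per sample: every row in the batch gets its own scalar in (0, 1). So if x and G(z) go in together as a batch of two, D returns two probabilities (in practice the real and fake batches are often fed in two separate passes, but the per-sample output shape is the same). A toy NumPy discriminator — one hypothetical sigmoid unit with random weights, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def discriminator(batch, w, b):
    # a toy one-layer discriminator: one sigmoid score per input sample
    return 1.0 / (1.0 + np.exp(-(batch @ w + b)))

w = rng.normal(size=(4, 1))
b = 0.0

x_real = rng.normal(size=(1, 4))       # one real sample x
g_z = rng.normal(size=(1, 4))          # one generated sample G(z)

batch = np.concatenate([x_real, g_z])  # feed both through D as one batch
d_out = discriminator(batch, w, b)
print(d_out.shape)  # (2, 1): one probability in (0, 1) per sample
```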

keras reports a memory error when running a cnn

As in the title: I originally taught myself tf, and yesterday I jumped into keras. No server — I installed a theano-backed keras on this bargain-bin Lenovo laptop. At first I ran a fully connected network with no problems, then built a very small cnn (code below). model.summary() can print the network structure, but once it actually runs, a message box pops up with an error:

Code:
```
import keras
import numpy as np
from keras.models import load_model

input1 = keras.layers.Input(shape=(25,))
x = keras.layers.Reshape([5, 5, 1])(input1)
x1 = keras.layers.Conv2D(filters=2, kernel_size=(2, 2), strides=(1, 1), padding='valid', activation='elu')(x)
x2 = keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(1, 1), padding='valid')(x1)
x3 = keras.layers.Conv2D(filters=4, kernel_size=(2, 2), strides=(1, 1), padding='valid', activation='elu')(x2)
x4 = keras.layers.AveragePooling2D(pool_size=(2, 2), strides=(1, 1), padding='valid')(x3)
x5 = keras.layers.Reshape([4*4*2, ])(x1)
xx = keras.layers.Dense(1, activation='elu')(x5)
model = keras.models.Model(inputs=input1, outputs=xx)
model.summary()
model.compile(loss='mse', optimizer='sgd')

def data():
    data = np.random.randint(0, 2, [1, 25])
    return data

def num(data):
    data = np.reshape(data, [25])
    sum_ = 0
    for i in data:
        sum_ = sum_ + i
    if sum_ > 10:
        result = [[1]]
    else:
        result = [[0]]
    return result

while True:
    for i in range(100):
        x = data()
        y = num(x)
        cost = model.train_on_batch([x], [y])
        print(i)
    x = data()
    y = num(x)
    cost = model.evaluate(x, y)
    print('loss=', cost)
    x = data()
    y = num(x)
    print('x=', x)
    print('y=', y)
    Y_pred = model.predict(x)
    print(Y_pred)
    words = input('continue??\::')
    if words == 'n':
        break
```
It can print the model structure: ![screenshot](https://img-ask.csdn.net/upload/202001/07/1578376564_807468.png)
But as soon as it runs further, a message box pops up with the error:
![screenshot](https://img-ask.csdn.net/upload/202001/07/1578376772_416127.png)
Any insight would be appreciated. My machine runs 32-bit XP with under 1 GB of RAM (an antique I keep around for fun), with python 2.7.15, numpy (1.16.6), scipy (1.2.2), theano (1.0.4), keras (2.3.1).
Please don't flame me — I normally write tf on a server; this laptop is pure entertainment. Thanks in advance.

fr_utils.py from week 4 of course 4 of Andrew Ng's deep learning specialization throws an error — has anyone hit this?

In Face Recognition/fr_utils.py, `_get_session()` on line 21 and `model` on line 140 cannot be resolved — what could the reason be? Loading the model raises the following error:
```
Using TensorFlow backend.
2018-08-26 21:30:53.046324: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
Total Params: 3743280
Traceback (most recent call last):
  File "C:/Users/51530/PycharmProjects/DL/wuenda/Face/faceV3.py", line 60, in <module>
    load_weights_from_FaceNet(FRmodel)
  File "C:\Users\51530\PycharmProjects\DL\wuenda\Face\fr_utils.py", line 133, in load_weights_from_FaceNet
    weights_dict = load_weights()
  File "C:\Users\51530\PycharmProjects\DL\wuenda\Face\fr_utils.py", line 154, in load_weights
    conv_w = genfromtxt(paths[name + '_w'], delimiter=',', dtype=None)
  File "E:\anaconda\lib\site-packages\numpy\lib\npyio.py", line 1867, in genfromtxt
    raise ValueError(errmsg)
ValueError: Some errors were detected !
    Line #7 (got 2 columns instead of 1)
    Line #12 (got 3 columns instead of 1)
    Line #15 (got 2 columns instead of 1)
```
The file in question:
```
#### PART OF THIS CODE IS USING CODE FROM VICTOR SY WANG: https://github.com/iwantooxxoox/Keras-OpenFace/blob/master/utils.py ####
import tensorflow as tf
import numpy as np
import os
import cv2
from numpy import genfromtxt
from keras.layers import Conv2D, ZeroPadding2D, Activation, Input, concatenate
from keras.models import Model
from keras.layers.normalization import BatchNormalization
from keras.layers.pooling import MaxPooling2D, AveragePooling2D
import h5py
import matplotlib.pyplot as plt

_FLOATX = 'float32'

def variable(value, dtype=_FLOATX, name=None):
    v = tf.Variable(np.asarray(value, dtype=dtype), name=name)
    _get_session().run(v.initializer)
    return v

def shape(x):
    return x.get_shape()

def square(x):
    return tf.square(x)

def zeros(shape, dtype=_FLOATX, name=None):
    return variable(np.zeros(shape), dtype, name)

def concatenate(tensors, axis=-1):
    if axis < 0:
        axis = axis % len(tensors[0].get_shape())
    return tf.concat(axis, tensors)

def LRN2D(x):
    return tf.nn.lrn(x, alpha=1e-4, beta=0.75)

def conv2d_bn(x,
              layer=None,
              cv1_out=None,
              cv1_filter=(1, 1),
              cv1_strides=(1, 1),
              cv2_out=None,
              cv2_filter=(3, 3),
              cv2_strides=(1, 1),
              padding=None):
    num = '' if cv2_out == None else '1'
    tensor = Conv2D(cv1_out, cv1_filter, strides=cv1_strides, data_format='channels_first', name=layer+'_conv'+num)(x)
    tensor = BatchNormalization(axis=1, epsilon=0.00001, name=layer+'_bn'+num)(tensor)
    tensor = Activation('relu')(tensor)
    if padding == None:
        return tensor
    tensor = ZeroPadding2D(padding=padding, data_format='channels_first')(tensor)
    if cv2_out == None:
        return tensor
    tensor = Conv2D(cv2_out, cv2_filter, strides=cv2_strides, data_format='channels_first', name=layer+'_conv'+'2')(tensor)
    tensor = BatchNormalization(axis=1, epsilon=0.00001, name=layer+'_bn'+'2')(tensor)
    tensor = Activation('relu')(tensor)
    return tensor

WEIGHTS = [
    'conv1', 'bn1', 'conv2', 'bn2', 'conv3', 'bn3',
    'inception_3a_1x1_conv', 'inception_3a_1x1_bn',
    'inception_3a_pool_conv', 'inception_3a_pool_bn',
    'inception_3a_5x5_conv1', 'inception_3a_5x5_conv2', 'inception_3a_5x5_bn1', 'inception_3a_5x5_bn2',
    'inception_3a_3x3_conv1', 'inception_3a_3x3_conv2', 'inception_3a_3x3_bn1', 'inception_3a_3x3_bn2',
    'inception_3b_3x3_conv1', 'inception_3b_3x3_conv2', 'inception_3b_3x3_bn1', 'inception_3b_3x3_bn2',
    'inception_3b_5x5_conv1', 'inception_3b_5x5_conv2', 'inception_3b_5x5_bn1', 'inception_3b_5x5_bn2',
    'inception_3b_pool_conv', 'inception_3b_pool_bn',
    'inception_3b_1x1_conv', 'inception_3b_1x1_bn',
    'inception_3c_3x3_conv1', 'inception_3c_3x3_conv2', 'inception_3c_3x3_bn1', 'inception_3c_3x3_bn2',
    'inception_3c_5x5_conv1', 'inception_3c_5x5_conv2', 'inception_3c_5x5_bn1', 'inception_3c_5x5_bn2',
    'inception_4a_3x3_conv1', 'inception_4a_3x3_conv2', 'inception_4a_3x3_bn1', 'inception_4a_3x3_bn2',
    'inception_4a_5x5_conv1', 'inception_4a_5x5_conv2', 'inception_4a_5x5_bn1', 'inception_4a_5x5_bn2',
    'inception_4a_pool_conv', 'inception_4a_pool_bn',
    'inception_4a_1x1_conv', 'inception_4a_1x1_bn',
    'inception_4e_3x3_conv1', 'inception_4e_3x3_conv2', 'inception_4e_3x3_bn1', 'inception_4e_3x3_bn2',
    'inception_4e_5x5_conv1', 'inception_4e_5x5_conv2', 'inception_4e_5x5_bn1', 'inception_4e_5x5_bn2',
    'inception_5a_3x3_conv1', 'inception_5a_3x3_conv2', 'inception_5a_3x3_bn1', 'inception_5a_3x3_bn2',
    'inception_5a_pool_conv', 'inception_5a_pool_bn',
    'inception_5a_1x1_conv', 'inception_5a_1x1_bn',
    'inception_5b_3x3_conv1', 'inception_5b_3x3_conv2', 'inception_5b_3x3_bn1', 'inception_5b_3x3_bn2',
    'inception_5b_pool_conv', 'inception_5b_pool_bn',
    'inception_5b_1x1_conv', 'inception_5b_1x1_bn',
    'dense_layer'
]

conv_shape = {
    'conv1': [64, 3, 7, 7],
    'conv2': [64, 64, 1, 1],
    'conv3': [192, 64, 3, 3],
    'inception_3a_1x1_conv': [64, 192, 1, 1],
    'inception_3a_pool_conv': [32, 192, 1, 1],
    'inception_3a_5x5_conv1': [16, 192, 1, 1],
    'inception_3a_5x5_conv2': [32, 16, 5, 5],
    'inception_3a_3x3_conv1': [96, 192, 1, 1],
    'inception_3a_3x3_conv2': [128, 96, 3, 3],
    'inception_3b_3x3_conv1': [96, 256, 1, 1],
    'inception_3b_3x3_conv2': [128, 96, 3, 3],
    'inception_3b_5x5_conv1': [32, 256, 1, 1],
    'inception_3b_5x5_conv2': [64, 32, 5, 5],
    'inception_3b_pool_conv': [64, 256, 1, 1],
    'inception_3b_1x1_conv': [64, 256, 1, 1],
    'inception_3c_3x3_conv1': [128, 320, 1, 1],
    'inception_3c_3x3_conv2': [256, 128, 3, 3],
    'inception_3c_5x5_conv1': [32, 320, 1, 1],
    'inception_3c_5x5_conv2': [64, 32, 5, 5],
    'inception_4a_3x3_conv1': [96, 640, 1, 1],
    'inception_4a_3x3_conv2': [192, 96, 3, 3],
    'inception_4a_5x5_conv1': [32, 640, 1, 1],
    'inception_4a_5x5_conv2': [64, 32, 5, 5],
    'inception_4a_pool_conv': [128, 640, 1, 1],
    'inception_4a_1x1_conv': [256, 640, 1, 1],
    'inception_4e_3x3_conv1': [160, 640, 1, 1],
    'inception_4e_3x3_conv2': [256, 160, 3, 3],
    'inception_4e_5x5_conv1': [64, 640, 1, 1],
    'inception_4e_5x5_conv2': [128, 64, 5, 5],
    'inception_5a_3x3_conv1': [96, 1024, 1, 1],
    'inception_5a_3x3_conv2': [384, 96, 3, 3],
    'inception_5a_pool_conv': [96, 1024, 1, 1],
    'inception_5a_1x1_conv': [256, 1024, 1, 1],
    'inception_5b_3x3_conv1': [96, 736, 1, 1],
    'inception_5b_3x3_conv2': [384, 96, 3, 3],
    'inception_5b_pool_conv': [96, 736, 1, 1],
    'inception_5b_1x1_conv': [256, 736, 1, 1],
}

def load_weights_from_FaceNet(FRmodel):
    # Load weights from csv files (which was exported from Openface torch model)
    weights = WEIGHTS
    weights_dict = load_weights()
    # Set layer weights of the model
    for name in weights:
        if FRmodel.get_layer(name) != None:
            FRmodel.get_layer(name).set_weights(weights_dict[name])
        elif model.get_layer(name) != None:
            model.get_layer(name).set_weights(weights_dict[name])

def load_weights():
    # Set weights path
    dirPath = './weights'
    fileNames = filter(lambda f: not f.startswith('.'), os.listdir(dirPath))
    paths = {}
    weights_dict = {}
    for n in fileNames:
        paths[n.replace('.csv', '')] = dirPath + '/' + n
    for name in WEIGHTS:
        if 'conv' in name:
            conv_w = genfromtxt(paths[name + '_w'], delimiter=',', dtype=None)
            conv_w = np.reshape(conv_w, conv_shape[name])
            conv_w = np.transpose(conv_w, (2, 3, 1, 0))
            conv_b = genfromtxt(paths[name + '_b'], delimiter=',', dtype=None)
            weights_dict[name] = [conv_w, conv_b]
        elif 'bn' in name:
            bn_w = genfromtxt(paths[name + '_w'], delimiter=',', dtype=None)
            bn_b = genfromtxt(paths[name + '_b'], delimiter=',', dtype=None)
            bn_m = genfromtxt(paths[name + '_m'], delimiter=',', dtype=None)
            bn_v = genfromtxt(paths[name + '_v'], delimiter=',', dtype=None)
            weights_dict[name] = [bn_w, bn_b, bn_m, bn_v]
        elif 'dense' in name:
            dense_w = genfromtxt(dirPath+'/dense_w.csv', delimiter=',', dtype=None)
            dense_w = np.reshape(dense_w, (128, 736))
            dense_w = np.transpose(dense_w, (1, 0))
            dense_b = genfromtxt(dirPath+'/dense_b.csv', delimiter=',', dtype=None)
            weights_dict[name] = [dense_w, dense_b]
    return weights_dict

def load_dataset():
    train_dataset = h5py.File('datasets/train_happy.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # your train set labels
    test_dataset = h5py.File('datasets/test_happy.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # your test set labels
    classes = np.array(test_dataset["list_classes"][:])  # the list of classes
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

def img_to_encoding(image_path, model):
    img1 = cv2.imread(image_path, 1)
    img = img1[..., ::-1]
    img = np.around(np.transpose(img, (2, 0, 1))/255.0, decimals=12)
    x_train = np.array([img])
    embedding = model.predict_on_batch(x_train)
    return embedding
```

YOLO errors out when running on my local machine

Problem description:
```
InternalError (see above for traceback): Blas SGEMM launch failed : m=81920, n=32, k=64
	 [[node conv2d_3/convolution (defined at j:\ProgramData\Anaconda3\envs\tensorflow\lib\site-packages\keras\backend\tensorflow_backend.py:3650) = Conv2D[T=DT_FLOAT, _class=["loc:@batch_normalization_3/cond/FusedBatchNorm/Switch"], data_format="NHWC", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](leaky_re_lu_2/LeakyRelu, conv2d_3/kernel/read)]]
	 [[{{node concat_11/_2897}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_3860_concat_11", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
```
I already tried the method described at https://blog.csdn.net/gsww404/article/details/80507704, but the problem persists. What else could cause this?
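Blas SGEMM launch failures are frequently a symptom of the GPU being out of memory — another process holding the card, or TensorFlow grabbing all of it at once. A commonly suggested TF1-era session config (a sketch, untested here; the 0.8 fraction is an arbitrary illustrative value) makes the allocation grow on demand instead:

```
# TF1-style session config: allocate GPU memory lazily
import tensorflow as tf
import keras.backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True                     # grow as needed
config.gpu_options.per_process_gpu_memory_fraction = 0.8   # optional hard cap
K.set_session(tf.Session(config=config))
```

Closing other GPU-using processes (other notebooks, a second training run) before launching is worth trying as well.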


疫情期间找工作确实有点难度,想拿到满意的薪资,确实要点实力啊!面试官:Spring中的@Value用过么,介绍一下我:@Value可以标注在字段上面,可以将外部配置文件中的数据,比如可以...

自学编程的 6 个致命误区

嗨,小伙伴们大家好,我是沉默王二。本篇文章来和大家聊聊自学编程中的一些误区——这是我在 B 站上看了羊哥的一期视频后有感而发的文章。因为确实有很多读者也曾私信问过我这些方面的问题,很有代表性,所以我就结合自己的亲身体会来谈一谈,希望对小伙伴们有所启发。 01、追求时髦 所谓基础不牢,地动山摇啊。可很多小伙伴压根就没注意过这个问题,市面上出什么新鲜的技术就想去尝试,结果把自己学的乱七八糟,心灰意冷...

相关热词 c#树形选择 c#中类图的使用方法 c# 传参 调用exe c# 怎么定义方法 c# 修改本地时间 c#前台怎么读取资源文件 c# xml转list c#实现框选截图 m*m乘法表c# c# 乘法99表
立即提问
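The shape question above comes down to the loss function: Keras passes the `y` you give `train_on_batch` to the loss as `y_true` unchanged, so nothing is "automatically fitted" against the first column. A `(batch, 2)` target against a single sigmoid output only works when the model was compiled with a custom policy-gradient loss that slices the action and the discounted reward back out of `y_true`. Below is a minimal sketch of that pattern; the loss name `pg_loss` and the random training batch are illustrative, not from the original code:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K

def pg_loss(y_true, y_pred):
    # y_true packs [action, discounted_reward] per sample;
    # y_pred is the single sigmoid probability of taking action 1.
    action = y_true[:, 0:1]   # 0 or 1, shape (batch, 1)
    reward = y_true[:, 1:2]   # discounted return, shape (batch, 1)
    prob = K.clip(y_pred, 1e-8, 1 - 1e-8)
    # log-probability of the action that was actually taken
    log_prob = action * K.log(prob) + (1 - action) * K.log(1 - prob)
    # REINFORCE objective: maximize reward-weighted log-probability
    return -K.mean(reward * log_prob)

inputs = Input(shape=(4,), name='ob_inputs')
x = Dense(16, activation='relu')(inputs)
x = Dense(16, activation='relu')(x)
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=inputs, outputs=x)
model.compile(optimizer='adam', loss=pg_loss)

# Illustrative batch: column 0 = actions, column 1 = discounted rewards
X = np.random.rand(5, 4).astype('float32')
actions = np.random.randint(0, 2, 5).astype('float32')
discount_rewards = np.random.rand(5).astype('float32')
y = np.array(list(zip(actions, discount_rewards)))
loss = model.train_on_batch(X, y)
print(float(loss))
```

So the direct answer is no: Keras will not pick out the first column by itself. If the model had instead been compiled with a built-in loss such as `binary_crossentropy`, the `(batch, 2)` target would either raise a shape error or broadcast incorrectly; the original training code presumably relies on a custom loss of this kind to unpack `y`.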