I'm using Keras for binary image classification. The labels are imbalanced, roughly 1:10. The code is as follows:
import numpy as np
data = np.load('D:/a.npz')
image_data, label_data = data['image'], data['label']
Because the data is imbalanced, I split it into 3 folds with stratified K-fold:
train_x=image_data[train]
test_x=image_data[test]
train_y=label_data[train]
test_y=label_data[test]
train_x = np.array(train_x)
test_x = np.array(test_x)
train_x = train_x.reshape(train_x.shape[0],1,28,28)
test_x = test_x.reshape(test_x.shape[0],1,28,28)
train_x = train_x.astype('float32')
test_x = test_x.astype('float32')
train_x /= 255
test_x /= 255
train_y = np.array(train_y)
test_y = np.array(test_y)
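For reference, the `train` and `test` index arrays used above are not shown in the post. A minimal sketch of how a 3-fold stratified split could produce them, assuming scikit-learn's `StratifiedKFold` and synthetic stand-in arrays (the data, shapes, and `random_state` here are illustrative assumptions, not the real `a.npz` contents):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the real arrays (assumption): 110 flattened
# 28x28 images with labels imbalanced at roughly 1:10.
rng = np.random.default_rng(0)
image_data = rng.integers(0, 256, size=(110, 784), dtype=np.uint8)
label_data = np.array([1] * 10 + [0] * 100)

# Stratification keeps the class ratio roughly equal in every fold.
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
fold_positive_counts = []
for train, test in skf.split(image_data, label_data):
    train_x, test_x = image_data[train], image_data[test]
    train_y, test_y = label_data[train], label_data[test]
    fold_positive_counts.append(int(test_y.sum()))
    print(len(test_y), "test samples,", int(test_y.sum()), "positives")
```

Printing the per-fold class counts like this is a quick way to confirm that each test fold really contains both classes.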
Then I use a Keras Sequential model:
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
model.fit(train_x, train_y, batch_size=128, class_weight='auto', epochs=10, verbose=1, validation_data=(test_x, test_y))
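One thing that may be worth checking in the `fit` call above: Keras documents `class_weight` as a dictionary mapping class indices to weights, so whether the string `'auto'` has any effect depends on the installed version (an assumption to verify, not a confirmed diagnosis). A minimal sketch of building such a dictionary with the common "balanced" heuristic, using hypothetical labels in place of the real `train_y`:

```python
import numpy as np

# Hypothetical labels with the roughly 1:10 imbalance described in the post.
train_y = np.array([1] * 10 + [0] * 100)

# "Balanced" heuristic: n_samples / (n_classes * count_of_class),
# so the rarer class receives the larger weight.
counts = np.bincount(train_y)
class_weight = {
    cls: float(len(train_y) / (len(counts) * n))
    for cls, n in enumerate(counts)
}
print(class_weight)

# The dict would then be passed explicitly, e.g.:
# model.fit(train_x, train_y, class_weight=class_weight, ...)
```

With these weights, an error on a minority-class sample costs about ten times as much as an error on a majority-class sample.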
from sklearn.metrics import confusion_matrix
y_pred_model = model.predict_proba(test_x)
C = confusion_matrix(test_y, y_pred_model)
print(C)
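Note that `confusion_matrix` expects hard class labels rather than probabilities, so the predicted probabilities normally need to be thresholded first. A minimal sketch, assuming a single sigmoid output unit (the model definition is not shown in the post) and hypothetical probability values standing in for the real predictions:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical sigmoid outputs of shape (n_samples, 1), standing in
# for the probabilities returned by the model on test_x.
y_prob = np.array([[0.1], [0.7], [0.4], [0.8], [0.9]])
test_y = np.array([0, 0, 1, 1, 1])

# Inspect the raw range first: if every value is nearly identical,
# the model has collapsed to a constant output.
print(y_prob.min(), y_prob.max())

# Threshold at 0.5 to obtain hard 0/1 labels, then build the matrix.
y_pred = (y_prob.ravel() >= 0.5).astype(int)
C = confusion_matrix(test_y, y_pred, labels=[0, 1])
print(C)
```

Rows of the matrix are true labels and columns are predicted labels. Looking at the raw probabilities before thresholding helps distinguish a model that outputs a constant from one that is merely miscalibrated around 0.5.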
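The result is always that every test sample is assigned to a single class.
My guess was that the imbalance is to blame: the model decides the optimum is to label every sample as the majority class. But after rebalancing the two classes to 1:1, every test sample is still assigned to one class: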
[[22 0]
[21 0]]
What is causing this? Where is the code wrong?