I'm a sixth grader who really wants to get into AI, so I got a copy of 《Python神经网络编程》 to study (Figure 1) and installed and set up jupyter. But when training the neural network (at the weight-update step) something goes wrong, and I have no idea why. I set an ipdb breakpoint and checked: the matrix sizes and shapes all look fine, yet jupyter keeps raising an error saying the matrices can't be multiplied. Hoping some expert can help (more details below).
First, the error message:
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 5)
I searched Baidu, and this seems to mean the first matrix's number of columns doesn't match the second matrix's number of rows. I went back in with breakpoints and fiddled around every which way, and... well, I still couldn't find the problem...
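To make sure I understood the message, I also wrote a tiny repro (the shapes here are ones I picked just to match the 3-vs-5 in the error, not my network's actual matrices):

import numpy

a = numpy.zeros((3, 5))  # operand 0: shape (n, k) with k = 5
b = numpy.zeros((3, 1))  # operand 1: its core dimension 0 is 3, but matmul needs it to equal k = 5
a @ b                    # raises the exact same ValueError (size 3 is different from 5)

So I think I understand the message itself; I just can't see where such a pair of shapes would come from in my code.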
Next, the weird jupyter screenshot:
In the screenshot, box 1 marks the statement about to be executed. The line before it has an ipdb breakpoint, and when I ran that statement by hand inside the breakpoint environment, it succeeded!!! It actually succeeded!!! But the moment I pressed c, the error came back!!! What kind of sorcery is this? I then removed the breakpoint: same error again. Exported it to a .py file and ran it locally: also the same error. What on earth is going on???
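For reference, this is roughly what I did at the ipdb prompt (reconstructed from memory, so the exact output may be slightly off; at this first stop i is 2, the last weight layer):

ipdb> p errors.shape, self.layers[i + 1].shape, self.layers[i].T.shape
((3, 1), (3, 1), (1, 5))
ipdb> !self.weights[i] += self.lr * ((errors * self.layers[i + 1] * (1.0 - self.layers[i + 1])) @ self.layers[i].T)
ipdb> c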
Here's a screenshot of the train function as well:
Finally, the full code, to make answering easier:
import numpy
import scipy.special

# Neural network
class Network:
    """Neural network class"""

    def __init__(self, nodes, l_rate):
        """Initialize the network"""
        self.nodes = nodes
        self.lr = l_rate
        self.layer_num = len(nodes)
        self.weight_num = self.layer_num - 1
        self._init_layers()
        self._init_weights()
        self.activ = scipy.special.expit

    def _init_layers(self):
        """Initialize all layers"""
        self.layers = [None for _ in range(self.layer_num)]
        for i in range(self.layer_num):
            # Number of nodes in this layer
            node_num = self.nodes[i]
            # Initialize this layer, filled with zeros
            self.layers[i] = numpy.zeros((node_num,)).T

    def _init_weights(self):
        """Initialize all weight matrices"""
        self.weights = [None for _ in range(self.weight_num)]
        for i in range(self.weight_num):
            # Fill from a normal distribution with std = 1/sqrt(next layer's node count)
            # Arguments: 1. center of the distribution  2. 1/sqrt(next layer's node count)  3. shape of the weight matrix
            self.weights[i] = numpy.random.normal(0.0, pow(self.nodes[i + 1], -0.5), (self.nodes[i + 1], self.nodes[i]))

    def query(self, inputs):
        """Query the network for a result"""
        assert len(inputs) == self.nodes[0]
        for i in range(self.layer_num):
            # On the first pass, set the "last result" to the input values
            if i == 0:
                self.layers[0] = numpy.array(inputs, ndmin=2).T
                continue
            # This layer's computed values
            self.layers[i] = self.weights[i - 1] @ self.layers[i - 1]
            self.layers[i] = self.activ(self.layers[i])
        return self.layers[-1]

    def train(self, inputs, targets):
        """Train the network"""
        assert len(targets) == self.nodes[-1]
        last_result = self.query(inputs)
        for i in reversed(range(self.weight_num)):
            if i == self.weight_num - 1:
                errors = numpy.array(targets, ndmin=2).T - last_result
            else:
                errors = self.weights[i + 1] @ errors
            import ipdb; ipdb.set_trace()
            # The problem is on this line ↓↓↓
            self.weights[i] += self.lr * ((errors * self.layers[i + 1] * (1.0 - self.layers[i + 1])) @ self.layers[i].T)


if __name__ == '__main__':
    nw = Network((3, 5, 5, 3), 0.01)
    for _ in range(10000):
        nw.train([1, 1, 1], [1, 1, 1])
    results = nw.query([1, 1, 1])
    print(results)
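In case it helps, here is the shape check I mentioned (just dumping every matrix's shape for this (3, 5, 5, 3) setup, with the ipdb line commented out):

nw = Network((3, 5, 5, 3), 0.01)
nw.query([1, 1, 1])  # one forward pass so the layers take their real shapes
for i, w in enumerate(nw.weights):
    print('weights[%d]:' % i, w.shape)  # (5, 3), (5, 5), (3, 5)
for i, l in enumerate(nw.layers):
    print('layers[%d]:' % i, l.shape)   # (3, 1), (5, 1), (5, 1), (3, 1)

All of these look consistent to me, which is exactly why I'm so confused.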