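Is the loss computed incorrectly in TensorFlow after adding Dropout? Have I hit a pitfall? I'm completely confused.

As shown below, I added a `layers.Dropout` layer among the fully connected layers. After 40 epochs the loss has converged to a small value, yet the outputs of both `predict` and a direct `call` still differ a lot from the true values. The predicted/true ratio is fairly close to `1 - dropout_rate`. In principle that should not happen: TF uses inverted dropout by default and applies the rescaling automatically during the forward pass. So the only thing left to suspect is that the loss is being computed incorrectly?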
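For reference, the inverted-dropout behaviour the question relies on can be sketched in plain NumPy. This is an illustrative stand-in, not the Keras source: `inverted_dropout` is a hypothetical helper, and the only assumption it encodes is the documented rule that kept units are scaled by `1 / (1 - rate)` during training while inference is the identity.

```python
import numpy as np

def inverted_dropout(x, rate, training, rng):
    """Hypothetical stand-in for a dropout layer (not Keras code)."""
    if not training:
        return x  # inference: identity, no masking and no rescaling
    keep = rng.random(x.shape) >= rate             # keep each unit with probability 1 - rate
    return np.where(keep, x / (1.0 - rate), 0.0)   # scale the survivors by 1 / (1 - rate)

rng = np.random.default_rng(0)
x = np.ones((1000, 512))

train_out = inverted_dropout(x, rate=0.5, training=True, rng=rng)
infer_out = inverted_dropout(x, rate=0.5, training=False, rng=rng)

print(train_out.mean())      # close to 1.0: the expected activation is preserved
print(infer_out.mean())      # exactly 1.0: inference leaves the input untouched
print(np.unique(train_out))  # only 0.0 and 2.0 appear in training mode
```

If the layer behaves this way, a systematic factor of `1 - rate` between training and inference outputs is not expected from the rescaling alone, which is what makes the observation below puzzling.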
Output and error messages
```python
>>> model0 = tf.keras.Sequential()
>>> model0.add(layers.Dense(
...     32,
...     activation='relu',
...     kernel_initializer=initializers_v2.HeUniform()))
>>> model0.add(layers.Dense(
...     512,
...     activation='relu',
...     kernel_initializer=initializers_v2.HeUniform()))
>>> model0.add(layers.Dropout(0.5))
>>> model0.add(layers.Dense(
...     512,
...     activation='relu',
...     kernel_initializer=initializers_v2.HeUniform()))
>>> model0.add(layers.Dense(
...     1,
...     kernel_initializer=initializers_v2.HeUniform()))
>>> model0.compile(
...     optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
...     loss=tf.keras.losses.MeanAbsoluteError(),
...     metrics=['mae'])
>>> model0.fit(
...     X_a, y_value_a,
...     epochs=40)
Epoch 40/40
625/625 [==============================] - 1s 2ms/step - loss: 0.2190 - mae: 0.2190
<keras.callbacks.History object at 0x000002E5CA4DB700>
```
The predicted/true ratio is fairly close to `1 - dropout_rate`:
```python
>>> y_pred_a = model0.predict(X_a)
625/625 [==============================] - 1s 826us/step
>>> y_call_a = model0.call(tf.convert_to_tensor(X_a))
>>> print(y_pred_a)
[[-13.342887 ]
 [ -5.501849 ]
 [ -6.9179006]
 ...
 [ 19.558184 ]
 [  1.6202304]
 [ -6.4136333]]
>>> print(y_call_a)
tf.Tensor(
[[-13.342106 ]
 [ -5.502385 ]
 [ -6.9180675]
 ...
 [ 19.557985 ]
 [  1.6198621]
 [ -6.413244 ]], shape=(20000, 1), dtype=float32)
>>> print(y_value_a)
[[-26.56738428]
 [-10.90707993]
 [-13.71867984]
 ...
 [ 38.34531367]
 [  3.16246066]
 [-12.71669086]]
```
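To make the "close to `1 - dropout_rate`" claim concrete, the ratio can be checked on the six rows that are visible in the printed output. This is only a quick sanity check: the values are copied by hand from `y_pred_a` and `y_value_a` above, and the elided rows are not included, so it says nothing certain about the full 20000 samples.

```python
# Rows copied from the printed y_pred_a and y_value_a (the post elides the rest).
y_pred = [-13.342887, -5.501849, -6.9179006, 19.558184, 1.6202304, -6.4136333]
y_true = [-26.56738428, -10.90707993, -13.71867984, 38.34531367, 3.16246066, -12.71669086]

# Element-wise ratio of prediction to ground truth.
ratios = [p / t for p, t in zip(y_pred, y_true)]
for p, t, r in zip(y_pred, y_true, ratios):
    print(f"pred={p:10.4f}  true={t:10.4f}  pred/true={r:.3f}")

mean_ratio = sum(ratios) / len(ratios)
print(f"mean ratio over these rows: {mean_ratio:.3f}")
```

On these rows every ratio falls between about 0.50 and 0.52, which matches the impression that inference outputs are roughly half of the targets when the dropout rate is 0.5.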
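Here the MAE computed by hand matches the loss returned by the model's `loss` and by `evaluate`, but it is very different from the loss held in `metrics`. Stranger still, after running `evaluate`, reading `metrics` again gives a different loss. Have I run into a serious pitfall?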
```python
>>> model0.metrics[0].result().numpy()
0.21902849
>>> loss_a = y_value_a - y_pred_a
>>> abs(loss_a).mean()
7.838831122853866
>>> model0.metrics[0].result().numpy()
0.21902849
>>> model0.loss(y_value_a, y_pred_a)
<tf.Tensor: shape=(), dtype=float32, numpy=7.8388314>
>>> model0.metrics[0].result().numpy()
0.21902849
>>> model0.evaluate(X_a,y_value_a)
625/625 [==============================] - 1s 961us/step - loss: 7.8388 - mae: 7.8388
[7.838833808898926, 7.838833808898926]
>>> model0.metrics[0].result().numpy()
7.838834
```
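One part of this transcript has a straightforward explanation: Keras metric objects are stateful. `fit` resets them at the start of each epoch and accumulates over the training batches (run in training mode, with Dropout active), while `evaluate` resets them again and accumulates over inference-mode batches. `model0.metrics[0].result()` therefore reports whichever run happened last and does not recompute anything. A minimal sketch with a standalone metric and toy numbers (not taken from the post), assuming a TF version where `reset_state()` exists (older releases spell it `reset_states()`):

```python
import tensorflow as tf

# A standalone metric stands in for the one the compiled model keeps internally.
mae = tf.keras.metrics.MeanAbsoluteError()

# Toy "last training epoch": small errors are accumulated into the metric.
mae.update_state([[1.0], [2.0]], [[1.2], [1.8]])
after_fit = float(mae.result())
print(after_fit)       # about 0.2, and it stays there until the state is reset

# evaluate() starts by resetting the metrics, then accumulates its own batches.
mae.reset_state()
mae.update_state([[1.0], [2.0]], [[0.5], [1.0]])
after_evaluate = float(mae.result())
print(after_evaluate)  # about 0.75: result() now reflects the evaluation run
```

This accounts for the value changing from 0.219 to 7.8388 after `evaluate`. It does not by itself explain why the training-mode and inference-mode losses are so far apart; comparing `model0(x, training=True)` with `model0(x, training=False)` on the same batch might help narrow that down.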