JayNicht
2021-07-27 15:22

For a neural network model, the second partial derivatives are always 0

My research project requires the second partial derivatives of a neural network, but after several attempts I find that no matter how I implement it, the second partial derivative always comes out as 0. My grasp of neural-network theory is not particularly solid, so I cannot determine the cause, but I suspect that neural networks may simply have difficulty fitting higher-power functions.

What I am trying to do:
f(x, y) = x² + y
Fit f(x, y) with a neural network, i.e. model(x, y) ≈ f(x, y),
then compute the second derivative of the model in x, i.e. ∂²model/∂x².
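As a quick sanity check on the target value (a NumPy sketch added here, not part of the original post): for f(x, y) = x² + y the analytic derivatives are ∂f/∂x = 2x and ∂²f/∂x² = 2, which central finite differences confirm at any sample point.

```python
import numpy as np

def f(x, y):
    return x * x + y

h = 1e-3
x0, y0 = 3.0, 5.0  # arbitrary evaluation point

# central differences for the first and second derivative in x
dfdx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)                  # ≈ 2*x0 = 6
d2fdx2 = (f(x0 + h, y0) - 2 * f(x0, y0) + f(x0 - h, y0)) / h**2   # ≈ 2

print(dfdx, d2fdx2)
```

So a model that fits f well should show dxdx ≈ 2 everywhere, independent of x and y.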

The model I use:

model = keras.Sequential([
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(1),
])

After training the model, I use automatic differentiation to compute the partial derivatives:

with tf.GradientTape(persistent=True) as tape3:
    tape3.watch(X)
    tape3.watch(Y)
    with tf.GradientTape(persistent=True) as tape4:
        tape4.watch(X)
        tape4.watch(Y)
        Z = tf.concat([X, Y], 1)
        ff = model(Z)
        dy = tape4.gradient(ff, Y)  # ∂model/∂y
        dx = tape4.gradient(ff, X)  # ∂model/∂x
        print(tf.concat([dx, dy], 1))
    dxdx = tape3.gradient(dx, X)    # ∂²model/∂x²
    print(dxdx)

dx and dy are basically correct, but dxdx is all zeros, even though it should be about 2. I can't find the cause. Did I write something wrong? Or does the network only fit x and y linearly, so that every second derivative is 0? Surely that can't be it…
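The guess in the last sentence is close to the truth: a network built only from affine layers and ReLU activations is piecewise linear in its inputs, so its exact second derivative is 0 almost everywhere (and undefined at the kinks). This can be seen without TensorFlow on a hand-built one-neuron "network" relu(w·x + b), where w and b are illustrative values:

```python
import numpy as np

def relu(t):
    return np.maximum(t, 0.0)

# a tiny "network": one ReLU neuron, weights chosen for illustration
w, b = 2.0, -1.0
def net(x):
    return relu(w * x + b)

x0 = 3.0   # a point away from the kink at x = 0.5
h = 1e-3

# central differences: the slope is w = 2 on this linear piece,
# and the second derivative is 0 on any linear piece
d1 = (net(x0 + h) - net(x0 - h)) / (2 * h)
d2 = (net(x0 + h) - 2 * net(x0) + net(x0 - h)) / h**2

print(d1, d2)  # d1 ≈ 2, d2 ≈ 0
```

Composing many such pieces (as the five Dense+ReLU layers above do) still gives a piecewise-linear function, so autodiff correctly reports dxdx = 0 wherever it is defined, no matter how well the model fits f.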

import tensorflow as tf
import numpy as np
from tensorflow import keras
import random

np.set_printoptions(threshold=np.inf, suppress=True)

# 101 random samples in [0, 10) for each input, rounded to 2 decimals
x = np.array([round(random.random() * 10, 2) for _ in range(101)]).reshape(101, 1)
y = np.array([round(random.random() * 10, 2) for _ in range(101)]).reshape(101, 1)

# target values: f(x, y) = x^2 + y
ans = (x * x + y).reshape(101, 1)

X=tf.convert_to_tensor(x, dtype=float)
Y=tf.convert_to_tensor(y, dtype=float)


model = keras.Sequential([
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(100, activation='relu'),
    keras.layers.Dense(1),
])

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001, epsilon=1e-07)


for i in range(2000):
    with tf.GradientTape() as tape:
        Z = tf.concat([X, Y], 1)
        f = model(Z)
        loss = tf.reduce_mean(tf.square(ans - f))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(grads_and_vars=zip(grads, model.trainable_variables))

    if i % 10 == 0:
        print(i, loss.numpy())


with tf.GradientTape(persistent=True) as tape3:
    tape3.watch(X)
    tape3.watch(Y)
    with tf.GradientTape(persistent=True) as tape4:
        tape4.watch(X)
        tape4.watch(Y)
        Z = tf.concat([X, Y], 1)
        ff = model(Z)
        dy = tape4.gradient(ff, Y)  # ∂model/∂y
        dx = tape4.gradient(ff, X)  # ∂model/∂x
        print(tf.concat([dx, dy], 1))
    dxdx = tape3.gradient(dx, X)    # ∂²model/∂x²
    print(dxdx)
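If the goal is a nonzero ∂²model/∂x², the usual remedy (an assumption on my part, not something tried in the original post) is to replace ReLU with a smooth activation such as tanh or softplus, since only a twice-differentiable activation can give the network a nonzero second derivative. The contrast can be seen even without TensorFlow, with a hand-rolled one-hidden-layer net and finite differences (all weights below are illustrative, hand-picked so the ReLU kinks at x = 0, 0.5, 1.5 stay away from the evaluation point x0 = 0.7):

```python
import numpy as np

# one hidden layer with 3 units: act(x @ W1 + b1) @ W2
W1 = np.array([[1.0, -1.0, 2.0]])
b1 = np.array([0.0, 0.5, -3.0])
W2 = np.array([[1.0], [1.0], [1.0]])

def net(x, act):
    return (act(np.array([[x]]) @ W1 + b1) @ W2).item()

def second_diff(f, x0, h=1e-3):
    # central second finite difference
    return (f(x0 + h) - 2.0 * f(x0) + f(x0 - h)) / h**2

relu = lambda t: np.maximum(t, 0.0)

d2_relu = second_diff(lambda x: net(x, relu), 0.7)
d2_tanh = second_diff(lambda x: net(x, np.tanh), 0.7)
print(d2_relu)  # ≈ 0: a ReLU net is piecewise linear
print(d2_tanh)  # clearly nonzero: tanh is twice differentiable
```

In the Keras model above this would mean `activation='tanh'` (or `'softplus'`) in the Dense layers; the same nested-tape code should then produce a dxdx that approaches 2 as the fit improves.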

1 answer

  • 爱晚乏客游 2021-07-28 09:24

What are your model's inputs and outputs? What does the input look like, and what does the output look like?

