the_little_boy 2020-10-31 19:58

SE attention mechanism question: after the Squeeze step (GlobalAveragePooling2D), the tensor's dimensions change, which breaks the convolution that follows. How can I fix this?

x = keras.layers.Conv2D(64,(3,3),padding='same')(input_tensor)

in_channels = K.int_shape(x)[-1]
### Attention mechanism
in_tensor = x

# Squeeze operation
x = layers.GlobalAveragePooling2D()(x)

ratio1 = 0.5

# Excitation
out = Conv2D(filters=in_channels//ratio1, kernel_size=(1,1))(x)
out = Activation("relu")(out)
out = Conv2D(filters=in_channels, kernel_size=(1,1))(out)
out = Activation('sigmoid')(out)
out = layers.Reshape((1, in_channels))(out)

scale = tf.multiply(in_tensor, out)


1 answer

  • 脑洞笔记 2024-05-27 17:30

    In the SE (Squeeze-and-Excitation) attention mechanism you describe, the GlobalAveragePooling2D operation does indeed change the tensor's dimensions. GlobalAveragePooling2D averages each feature map over its spatial dimensions, reducing (height, width, channels) down to just (channels,); that is the Squeeze operation. After the Excitation step, however, the output has to be brought back into a shape that can be combined with the original input in_tensor through the element-wise multiplication (tf.multiply).
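
    As a minimal sketch of that shape change (the 32x32x64 input size is just a placeholder, not taken from the question):

    from tensorflow.keras import layers, backend as K

    x = layers.Input(shape=(32, 32, 64))          # (batch, height, width, channels)
    pooled = layers.GlobalAveragePooling2D()(x)   # spatial dimensions are averaged away
    print(K.int_shape(x))       # (None, 32, 32, 64)
    print(K.int_shape(pooled))  # (None, 64) -- rank 2, so a Conv2D can no longer be applied directly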

    The problem is that out = layers.Reshape((1, in_channels))(out) tries to reshape out, but the target shape (1, in_channels) matches neither what the 1x1 convolutions expect nor the shape of in_tensor. On top of that, GlobalAveragePooling2D returns a rank-2 tensor of shape (batch, channels), so the Conv2D layers of the Excitation step cannot be applied to it directly in the first place. The fix is to reshape the pooled vector to (1, 1, in_channels) before the 1x1 convolutions and then let the resulting channel weights combine with in_tensor by broadcasting.

    Concretely, add a Reshape right after GlobalAveragePooling2D that turns the pooled output into shape (batch_size, 1, 1, channels), so the 1x1 convolutions in the Excitation step have a rank-4 input to work on. After the sigmoid the attention weights still have shape (batch_size, 1, 1, channels); when they are multiplied with in_tensor, TensorFlow broadcasts them across the height and width dimensions, so there is no need to recover the spatial dimensions explicitly.

    The modified code looks like this:

    from tensorflow.keras import layers, models
    from tensorflow.keras import backend as K
    from tensorflow.keras.layers import Conv2D, Activation
    import tensorflow as tf
    
    input_tensor = layers.Input(shape=(224, 224, 3))
    x = layers.Conv2D(64, (3, 3), padding='same')(input_tensor)
    
    in_channels = K.int_shape(x)[-1]
    # Attention mechanism
    in_tensor = x
    
    # Squeeze operation
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Reshape((1, 1, in_channels))(x)  # (batch, 1, 1, channels): rank-4, so the 1x1 convolutions below can be applied
    
    ratio1 = 0.5
    
    # Excitation
    out = Conv2D(filters=int(in_channels//ratio1), kernel_size=(1, 1))(x)
    out = Activation("relu")(out)
    out = Conv2D(filters=in_channels, kernel_size=(1, 1))(out)
    out = Activation('sigmoid')(out)
    
    # Scale: the (batch, 1, 1, channels) attention weights broadcast across the
    # spatial dimensions of in_tensor, so no further Reshape is needed here
    scale = tf.multiply(in_tensor, out)
    

    Here the Reshape right after the pooling gives out a (1, 1, in_channels) shape that the 1x1 convolutions can consume, and the resulting channel weights are broadcast across the spatial dimensions of in_tensor during the element-wise multiplication. That resolves the dimension-mismatch error.
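
    If you would rather keep every step expressed as a Keras layer, the final line can equivalently use layers.Multiply, which applies the same broadcasting; this is only an optional sketch built on the in_tensor and out tensors defined above:

    # Optional alternative to tf.multiply: the Keras merge layer also
    # broadcasts the (1, 1, channels) weights over height and width
    scale = layers.Multiply()([in_tensor, out])

    Expressing the scaling as a layer keeps it visible as a named entry in model.summary(), which can make the model a little easier to inspect and serialize.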

