小月饼呐呐 2023-09-08 16:04 Acceptance rate: 0%
Views: 19
Closed

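Transformer in the TensorFlow framework

I am using TensorFlow for multivariate time-series forecasting: the past three steps of historical data are used to predict the next step's demand. Everything works up to compiling the model, but training fails on the inputs. Did I construct the inputs incorrectly, or is something wrong with the model?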

Data

(screenshot not reproduced)

Splitting into X and Y

(screenshot not reproduced)
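Because the screenshots of the data and the split are not reproduced here, the following is a minimal sketch of how such a split is commonly built with a sliding window. It rests on assumptions that are not in the original post: the data is a 2-D array with 12 feature columns, demand is the last column, and `make_windows` and `target_col` are hypothetical names chosen for illustration.

```python
import numpy as np

def make_windows(data, input_steps=3, target_col=-1):
    """Build (X, y) pairs: X holds `input_steps` consecutive rows,
    y holds the target column of the row that follows them."""
    data = np.asarray(data, dtype=np.float32)
    X, y = [], []
    for start in range(len(data) - input_steps):
        X.append(data[start:start + input_steps])        # (input_steps, n_features)
        y.append(data[start + input_steps, target_col])  # scalar: next-step demand
    return np.stack(X), np.asarray(y, dtype=np.float32).reshape(-1, 1)

# Toy data: 10 time steps, 12 features (random values, for shape checking only).
rng = np.random.default_rng(0)
series = rng.normal(size=(10, 12))
X, y = make_windows(series)
print(X.shape, y.shape)  # (7, 3, 12) (7, 1)
```

With this layout, `X` matches an input declared as `keras.Input(shape=(3, 12))`, and `y` carries one target value per sample.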

Defining the hyperparameters

num_heads = 4
num_encoder_layers = 2
num_decoder_layers = 2
d_model = 12
dff = 12
input_sequence_length = 3  # number of input time steps
output_sequence_length = 1  # number of output time steps
batch_size = 32
num_epochs = 50

Model

from tensorflow import keras
from tensorflow.keras import layers

def transformer_encoder(inputs, d_model, num_heads, dff, num_layers, dropout_rate=0.1):
    attention = layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)
    outputs = inputs
    for _ in range(num_layers):
        # Multi-Head Self-Attention
        attention_out = attention(outputs, outputs)
        attention_out = layers.Dropout(dropout_rate)(attention_out)
        # Residual Connection
        outputs = layers.Add()([outputs, attention_out])
        # Layer Normalization
        outputs = layers.LayerNormalization(epsilon=1e-6)(outputs)
        # Feed Forward Network
        ffnn = keras.Sequential([
            layers.Dense(dff, activation='relu'),
            layers.Dense(d_model)
        ])
        ffnn_out = ffnn(outputs)
        ffnn_out = layers.Dropout(dropout_rate)(ffnn_out)
        # Residual Connection
        outputs = layers.Add()([outputs, ffnn_out])
        # Layer Normalization
        outputs = layers.LayerNormalization(epsilon=1e-6)(outputs)
    return outputs

def build_model(input_shape, output_sequence_length, num_heads, num_encoder_layers, num_decoder_layers, d_model, dff, dropout_rate=0.1):
    inputs = keras.Input(shape=input_shape)
    x = inputs
    for _ in range(num_encoder_layers):
        x = transformer_encoder(x, d_model, num_heads, dff, num_encoder_layers, dropout_rate)
    decoder_inputs = keras.Input(shape=(output_sequence_length, input_shape[-1]))
    x = decoder_inputs
    for _ in range(num_decoder_layers):
        x = transformer_encoder(x, d_model, num_heads, dff, num_decoder_layers, dropout_rate)
    outputs = layers.Dense(output_sequence_length)(x)
    return keras.Model(inputs=[inputs, decoder_inputs], outputs=outputs)
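One detail in `transformer_encoder` is separate from the shape error but worth checking: the `MultiHeadAttention` layer is instantiated once before the loop and then called in every iteration. In Keras, calling the same layer instance several times reuses a single set of weights. The effect can be seen with a smaller stand-in layer; `Dense` is used here purely for illustration, and the variable names are hypothetical.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(3, 12))

# One Dense instance called twice: both calls use the same kernel and bias.
shared = layers.Dense(12)
shared_model = keras.Model(inputs, shared(shared(inputs)))

# Two Dense instances: each call has its own kernel and bias.
separate_model = keras.Model(inputs, layers.Dense(12)(layers.Dense(12)(inputs)))

print(len(shared_model.trainable_weights))    # 2
print(len(separate_model.trainable_weights))  # 4
```

If each encoder layer is meant to learn its own attention weights, the layer would need to be created inside the loop rather than before it.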

Compiling and training

```python
input_shape = (3, 12)  # (input time steps, number of features); replace 12 with your own feature count
model = build_model(input_shape, output_sequence_length, num_heads, num_encoder_layers, num_decoder_layers, d_model, dff)

# Compile the model
model.compile(optimizer='adam', loss='mse', metrics=['mae'])

# Train the model
# Supply your own training data and labels.
# x_train and y_train should have shapes (samples, input time steps, features) and (samples, output time steps, features)
model.fit([trainX, trainX], trainY, batch_size=batch_size, epochs=num_epochs, validation_data=(valX, valY))
```
Error:

```python
ValueError                                Traceback (most recent call last)
<ipython-input-110-67eda37bdadd> in <module>
      2 # Supply your own training data and labels.
      3 # x_train and y_train should have shapes (samples, input time steps, features) and (samples, output time steps, features)
----> 4 model.fit([trainX, trainX], trainY, batch_size=batch_size, epochs=num_epochs, validation_data=(valX, valY))

E:\anaconda\lib\site-packages\keras\utils\traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

E:\anaconda\lib\site-packages\tensorflow\python\framework\func_graph.py in autograph_handler(*args, **kwargs)
   1145           except Exception as e:  # pylint:disable=broad-except
   1146             if hasattr(e, "ag_error_metadata"):
-> 1147               raise e.ag_error_metadata.to_exception(e)
   1148             else:
   1149               raise

ValueError: in user code:

    File "E:\anaconda\lib\site-packages\keras\engine\training.py", line 1021, in train_function  *
        return step_function(self, iterator)
    File "E:\anaconda\lib\site-packages\keras\engine\training.py", line 1010, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "E:\anaconda\lib\site-packages\keras\engine\training.py", line 1000, in run_step  **
        outputs = model.train_step(data)
    File "E:\anaconda\lib\site-packages\keras\engine\training.py", line 859, in train_step
        y_pred = self(x, training=True)
    File "E:\anaconda\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None
    File "E:\anaconda\lib\site-packages\keras\engine\input_spec.py", line 264, in assert_input_compatibility
        raise ValueError(f'Input {input_index} of layer "{layer_name}" is '

    ValueError: Input 1 of layer "model_12" is incompatible with the layer: expected shape=(None, 1, 12), found shape=(None, 3, 12)
```


16 answers

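  • 专家-赤兔 2023-09-08 16:25

    Answer based on a reply from the 皆我百晓生 mini-program:
    According to the error message, the shape of the input data does not match what the model expects. Input 1, which is the model's second input, expects shape (None, 1, 12) but received (None, 3, 12), so the dimensions of the data passed in do not match.

    When the model is built, the input shape is defined as (3, 12), where 3 is the number of input time steps and 12 is the number of features. For training, trainX and trainY should have shapes (samples, input time steps, features) and (samples, output time steps, features). Given the error message and the model definition, trainX should be (None, 3, 12) and trainY should be (None, 1, 12).

    So check whether the dimensions of trainX and trainY are correct, and make sure that the shape of every array passed to the model matches the model definition. If you change the model definition to fit your data, the training data's dimensions must be adjusted to match.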
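The mismatch comes from the `fit` call rather than from `trainX` itself: `build_model` declares its second input, `decoder_inputs`, with shape `(output_sequence_length, 12)`, that is `(1, 12)`, while `[trainX, trainX]` passes the `(N, 3, 12)` array for both inputs. Passing an array of shape `(N, 1, 12)` as the second element, for example `trainX[:, -1:, :]`, would satisfy the shape check, and `validation_data` would then need the same two-input structure. Note also that the posted `build_model` overwrites `x` with `decoder_inputs`, so the encoder branch never reaches the output. For a one-step forecast, a simpler option is an encoder-only model. The following is a minimal sketch of that alternative, not the asker's original architecture: `encoder_block` and `build_encoder_only_model` are hypothetical names, the arrays are random placeholders, and `trainY` is assumed to hold one demand value per sample.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

def encoder_block(x, d_model, num_heads, dff, dropout_rate=0.1):
    """One post-norm Transformer encoder block with its own weights."""
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)(x, x)
    attn = layers.Dropout(dropout_rate)(attn)
    x = layers.LayerNormalization(epsilon=1e-6)(layers.Add()([x, attn]))
    ffn = layers.Dense(dff, activation="relu")(x)
    ffn = layers.Dense(d_model)(ffn)
    ffn = layers.Dropout(dropout_rate)(ffn)
    return layers.LayerNormalization(epsilon=1e-6)(layers.Add()([x, ffn]))

def build_encoder_only_model(input_shape, num_layers, d_model, num_heads, dff):
    inputs = keras.Input(shape=input_shape)   # (3, 12)
    x = inputs                                # assumes d_model equals the feature count
    for _ in range(num_layers):
        x = encoder_block(x, d_model, num_heads, dff)
    x = layers.GlobalAveragePooling1D()(x)    # average over time: (batch, d_model)
    outputs = layers.Dense(1)(x)              # (batch, 1): next-step demand
    return keras.Model(inputs=inputs, outputs=outputs)

model = build_encoder_only_model((3, 12), num_layers=2, d_model=12, num_heads=4, dff=12)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

# Random placeholder arrays, only to show the shapes that fit() expects.
rng = np.random.default_rng(0)
trainX = rng.normal(size=(64, 3, 12)).astype("float32")
trainY = rng.normal(size=(64, 1)).astype("float32")
valX = rng.normal(size=(16, 3, 12)).astype("float32")
valY = rng.normal(size=(16, 1)).astype("float32")

history = model.fit(trainX, trainY, batch_size=32, epochs=1,
                    validation_data=(valX, valY), verbose=0)
print(model.output_shape)  # (None, 1)
```

With a single input there is no second array to get wrong, and `validation_data=(valX, valY)` has the same structure as the training arguments.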


Question events

  • Closed by the system on September 16
  • Question created on September 8
