kaichengw1 2021-05-07 19:30

LSTM training has hit a bottleneck: test-set accuracy is stuck at 44%

 

import torch
import torch.nn as nn

# input_size, hidden_size, num_layers, learning_rate, epoches are assumed to be defined elsewhere
class LSTM(nn.Module):
    def __init__(self):
        super(LSTM, self).__init__()
        self.lstm = nn.LSTM(input_size=input_size,
                            hidden_size=hidden_size,
                            num_layers=num_layers,
                            batch_first=True,
                            bidirectional=True)
        # bidirectional doubles the output feature dimension
        self.output_layer = nn.Linear(in_features=hidden_size*2, out_features=4)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, x):
        lstm_out, (h_n, c_n) = self.lstm(x, None)  # None: hidden state defaults to zeros
        lstm_out = self.dropout(lstm_out)
        output = self.output_layer(lstm_out[:, -1, :])  # classify from the last timestep
        return output

lstm = LSTM()
lstm = lstm.float()
print(lstm)
optimizer = torch.optim.Adam(lstm.parameters(), lr=learning_rate)
loss_function = nn.CrossEntropyLoss()
for epoch in range(epoches):
    print("Running epoch {}".format(epoch))
    for step, (batch_x, batch_y) in enumerate(train_loader):
        optimizer.zero_grad()

        batch_x = batch_x.view(-1, 1, 300)  # (batch, seq_len=1, features)
        output = lstm(batch_x.float())

        loss = loss_function(output, batch_y.long())
        loss.backward()
        optimizer.step()

        if step % 50 == 0:
            test_x = dev.x.view(-1, 1, 300)
            test_output = lstm(test_x.float())
            pred_y = torch.max(test_output, dim=1)[1].data.numpy()

            accuracy = ((pred_y == dev.y.data.numpy()).astype(int).sum()) / float(dev.y.size(0))
            print('Epoch: ', epoch, '| train loss: %.4f' % loss.data.numpy(), '| test accuracy: %.2f' % accuracy)

This is an LSTM that infers a user's location from text content, and the accuracy just won't improve. Where is the problem, and what could be improved?


1 answer

  • AI Whisper 2024-07-12 01:01
    
    import torch
    import torch.nn as nn

    class LSTM(nn.Module):
        def __init__(self, input_size, hidden_size, num_layers, output_size):
            super(LSTM, self).__init__()
            self.hidden_size = hidden_size
            self.num_layers = num_layers
            self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, bidirectional=True)
            self.output_layer = nn.Linear(hidden_size*2, output_size)
            self.dropout = nn.Dropout(p=0.5)

        def forward(self, x):
            # Initialize hidden and cell states with zeros (factor 2 for bidirectional)
            h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(x.device)
            c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(x.device)

            # Forward propagate the LSTM
            lstm_out, _ = self.lstm(x, (h0, c0))

            # Apply dropout, then classify from the last timestep
            lstm_out = self.dropout(lstm_out)
            output = self.output_layer(lstm_out[:, -1, :])
            return output

    # Example usage and training loop
    input_size = 300
    hidden_size = 128
    num_layers = 2
    output_size = 4
    learning_rate = 0.001
    num_epochs = 10

    lstm = LSTM(input_size, hidden_size, num_layers, output_size)
    print(lstm)

    optimizer = torch.optim.Adam(lstm.parameters(), lr=learning_rate)
    loss_function = nn.CrossEntropyLoss()

    # Assuming train_loader and dev (the dev/test set) are defined

    for epoch in range(num_epochs):
        print("Running epoch {}".format(epoch))
        lstm.train()  # make sure dropout is active during training
        for step, (batch_x, batch_y) in enumerate(train_loader):
            optimizer.zero_grad()

            # Reshape batch_x to (batch, seq_len=1, input_size)
            batch_x = batch_x.view(-1, 1, input_size)

            # Forward pass
            output = lstm(batch_x.float())

            # Compute loss
            loss = loss_function(output, batch_y.long())

            # Backward pass and optimize
            loss.backward()
            optimizer.step()

            if step % 50 == 0:
                lstm.eval()  # disable dropout while evaluating
                with torch.no_grad():
                    # Evaluate on the dev set
                    test_x = dev.x.view(-1, 1, input_size)
                    test_output = lstm(test_x.float())
                    _, pred_y = torch.max(test_output, dim=1)

                    # Compute accuracy
                    accuracy = (pred_y == dev.y).float().mean().item()

                    print('Epoch: {}, Step: {}, Train Loss: {:.4f}, Test Accuracy: {:.2f}'.format(epoch, step, loss.item(), accuracy))
                lstm.train()  # restore training mode
    

    Explanation of Improvements:

    • Class initialization: the necessary parameters (input_size, hidden_size, num_layers, output_size) are passed to the LSTM constructor, making the class more flexible.
    • Forward method: the hidden state (h0 and c0) is initialized explicitly inside forward and passed to the LSTM module, which keeps batch sizes and device placement correct.
    • Training loop: dev-set evaluation runs under the torch.no_grad() context manager to save memory and computation, the accuracy calculation is corrected, and the print statements are clearer; a standalone evaluation helper is sketched below.

    These changes initialize and train the LSTM correctly while following PyTorch best practices. Adjust the parameters (input_size, hidden_size, num_layers, output_size) to your specific task.
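
    To complement the torch.no_grad() point: dropout should also be disabled while evaluating, or dev-set predictions stay noisy. Here is a minimal sketch of a reusable evaluation helper, assuming dev.x and dev.y are tensors as in the code above; the helper name evaluate and its signature are illustrative, not part of the original post.

    def evaluate(model, x, y, input_size=300):
        # Accuracy on a held-out set, with dropout disabled
        model.eval()               # eval mode switches off dropout
        with torch.no_grad():      # no gradients needed for evaluation
            logits = model(x.view(-1, 1, input_size).float())
            pred = logits.argmax(dim=1)
            accuracy = (pred == y).float().mean().item()
        model.train()              # restore training mode
        return accuracy

    # Usage inside the training loop:
    # acc = evaluate(lstm, dev.x, dev.y)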

