kaichengw1 2021-05-07 19:30

LSTM training has hit a bottleneck: test-set accuracy is stuck at 44%

 

import torch
import torch.nn as nn

# input_size, hidden_size, num_layers, learning_rate, epoches are assumed to be defined elsewhere
class LSTM(nn.Module):
    def __init__(self):
        super(LSTM, self).__init__()
        self.lstm = nn.LSTM(input_size=input_size,
                            hidden_size=hidden_size,
                            num_layers=num_layers,
                            batch_first=True,
                            bidirectional=True)
        # bidirectional doubles the output feature dimension
        self.output_layer = nn.Linear(in_features=hidden_size*2, out_features=4)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, x):
        lstm_out, (h_n, c_n) = self.lstm(x, None)  # None: hidden state defaults to zeros
        lstm_out = self.dropout(lstm_out)
        output = self.output_layer(lstm_out[:, -1, :])  # classify from the last timestep
        return output

lstm = LSTM()
lstm = lstm.float()
print(lstm)
optimizer = torch.optim.Adam(lstm.parameters(), lr=learning_rate)
loss_function = nn.CrossEntropyLoss()
for epoch in range(epoches):
    print("Running epoch {}".format(epoch))
    for step, (batch_x, batch_y) in enumerate(train_loader):
        optimizer.zero_grad()

        batch_x = batch_x.view(-1, 1, 300)  # (batch, seq_len=1, features)
        output = lstm(batch_x.float())

        loss = loss_function(output, batch_y.long())
        loss.backward()
        optimizer.step()

        if step % 50 == 0:
            test_x = dev.x.view(-1, 1, 300)
            test_output = lstm(test_x.float())
            pred_y = torch.max(test_output, dim=1)[1].data.numpy()

            accuracy = ((pred_y == dev.y.data.numpy()).astype(int).sum()) / float(dev.y.size(0))
            print('Epoch: ', epoch, '| train loss: %.4f' % loss.data.numpy(), '| test accuracy: %.2f' % accuracy)

This is an LSTM that infers a user's location from text content, and the accuracy just won't improve. Where is the problem, and what could be improved?


1 answer

  • AI Whisper 2024-07-12 01:01
    
    import torch
    import torch.nn as nn

    class LSTM(nn.Module):
        def __init__(self, input_size, hidden_size, num_layers, output_size):
            super(LSTM, self).__init__()
            self.hidden_size = hidden_size
            self.num_layers = num_layers
            self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, bidirectional=True)
            self.output_layer = nn.Linear(hidden_size*2, output_size)
            self.dropout = nn.Dropout(p=0.5)

        def forward(self, x):
            # Initialize hidden and cell states with zeros (factor 2 for bidirectional)
            h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(x.device)
            c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(x.device)

            # Forward propagate the LSTM
            lstm_out, _ = self.lstm(x, (h0, c0))

            # Apply dropout, then classify from the last timestep
            lstm_out = self.dropout(lstm_out)
            output = self.output_layer(lstm_out[:, -1, :])
            return output

    # Example usage and training loop
    input_size = 300
    hidden_size = 128
    num_layers = 2
    output_size = 4
    learning_rate = 0.001
    num_epochs = 10

    lstm = LSTM(input_size, hidden_size, num_layers, output_size)
    print(lstm)

    optimizer = torch.optim.Adam(lstm.parameters(), lr=learning_rate)
    loss_function = nn.CrossEntropyLoss()

    # Assuming train_loader and dev (the dev/test set) are defined

    for epoch in range(num_epochs):
        print("Running epoch {}".format(epoch))
        lstm.train()  # make sure dropout is active during training
        for step, (batch_x, batch_y) in enumerate(train_loader):
            optimizer.zero_grad()

            # Reshape batch_x to (batch, seq_len=1, input_size)
            batch_x = batch_x.view(-1, 1, input_size)

            # Forward pass
            output = lstm(batch_x.float())

            # Compute loss
            loss = loss_function(output, batch_y.long())

            # Backward pass and optimize
            loss.backward()
            optimizer.step()

            if step % 50 == 0:
                lstm.eval()  # disable dropout while evaluating
                with torch.no_grad():
                    # Evaluate on the dev set
                    test_x = dev.x.view(-1, 1, input_size)
                    test_output = lstm(test_x.float())
                    _, pred_y = torch.max(test_output, dim=1)

                    # Compute accuracy
                    accuracy = (pred_y == dev.y).float().mean().item()

                    print('Epoch: {}, Step: {}, Train Loss: {:.4f}, Test Accuracy: {:.2f}'.format(epoch, step, loss.item(), accuracy))
                lstm.train()  # restore training mode
    

    Explanation of Improvements:

    • Class initialization: the necessary parameters (input_size, hidden_size, num_layers, output_size) are passed to the LSTM constructor, making the class more flexible.
    • Forward method: the hidden state (h0 and c0) is initialized explicitly inside forward and passed to the LSTM module, which keeps batch sizes and device placement correct.
    • Training loop: dev-set evaluation runs under the torch.no_grad() context manager to save memory and computation, the accuracy calculation is corrected, and the print statements are clearer; a standalone evaluation helper is sketched below.

    These changes initialize and train the LSTM correctly while following PyTorch best practices. Adjust the parameters (input_size, hidden_size, num_layers, output_size) to your specific task.
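
    To complement the torch.no_grad() point: dropout should also be disabled while evaluating, or dev-set predictions stay noisy. Here is a minimal sketch of a reusable evaluation helper, assuming dev.x and dev.y are tensors as in the code above; the helper name evaluate and its signature are illustrative, not part of the original post.

    def evaluate(model, x, y, input_size=300):
        # Accuracy on a held-out set, with dropout disabled
        model.eval()               # eval mode switches off dropout
        with torch.no_grad():      # no gradients needed for evaluation
            logits = model(x.view(-1, 1, input_size).float())
            pred = logits.argmax(dim=1)
            accuracy = (pred == y).float().mean().item()
        model.train()              # restore training mode
        return accuracy

    # Usage inside the training loop:
    # acc = evaluate(lstm, dev.x, dev.y)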

