kaichengw1 2021-05-07 19:30 采纳率: 50%
浏览 30

lstm训练遇到瓶颈 测试集正确率在44

 

class LSTM(nn.Module):
    def __init__(self):
        super(LSTM, self).__init__()
        self.lstm = nn.LSTM(input_size=input_size,
                            hidden_size=hidden_size,   
                            num_layers=num_layers,     
                            batch_first=True,
                            bidirectional = True)
        self.output_layer = nn.Linear(in_features=hidden_size*2, out_features=4)
        self.dropout = nn.Dropout(p=0.5)
        
    def forward(self, x):
        lstm_out, (h_n, h_c) = self.lstm(x, None)
        lstm_out = self.dropout(lstm_out)
        output = self.output_layer(lstm_out[:, -1, :])
        return output

lstm = LSTM()
lstm = lstm.float()
print(lstm)
optimizer = torch.optim.Adam(lstm.parameters(), lr=learning_rate)
loss_function = nn.CrossEntropyLoss()
for epoch in range(epoches):
        print("进行第{}个epoch".format(epoch))
        for step, (batch_x, batch_y) in enumerate(train_loader):
            optimizer.zero_grad()
            
            batch_x = batch_x.view(-1,1,300)
            output = lstm(batch_x.float())
            
            loss = loss_function(output, batch_y.long())
            loss.backward()
            optimizer.step()
            
            if step % 50 == 0:
                test_x = dev.x.view(-1,1,300)
                test_output = lstm(test_x.float())
                pred_y = torch.max(test_output, dim=1)[1].data.numpy()
                
                accuracy = ((pred_y == dev.y.data.numpy()).astype(int).sum()) / float(dev.y.size(0))
                print('Epoch: ', epoch, '| train loss: %.4f' % loss.data.numpy(), '| test accuracy: %.2f' % accuracy)

 这是一个根据文本内容分析用户地点的lstm,正确率一直没法提高,是哪里出了问题,有没有可以改进的地方?

  • 写回答

1条回答 默认 最新

  • AI Whisper 2024-07-12 01:01
    关注
    
    import torch
    import torch.nn as nn
    
    class LSTM(nn.Module):
        def __init__(self, input_size, hidden_size, num_layers, output_size):
            super(LSTM, self).__init__()
            self.hidden_size = hidden_size
            self.num_layers = num_layers
            self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, bidirectional=True)
            self.output_layer = nn.Linear(hidden_size*2, output_size)
            self.dropout = nn.Dropout(p=0.5)
            
        def forward(self, x):
            # Initialize hidden state with zeros
            h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(x.device)  # 2 for bidirectional
            c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(x.device)
            
            # Forward propagate LSTM
            lstm_out, _ = self.lstm(x, (h0, c0))
            
            # Dropout and apply output layer
            lstm_out = self.dropout(lstm_out)
            output = self.output_layer(lstm_out[:, -1, :])
            return output
    
    # Example usage and training loop
    input_size = 300
    hidden_size = 128
    num_layers = 2
    output_size = 4
    learning_rate = 0.001
    epoches = 10
    
    lstm = LSTM(input_size, hidden_size, num_layers, output_size)
    print(lstm)
    
    optimizer = torch.optim.Adam(lstm.parameters(), lr=learning_rate)
    loss_function = nn.CrossEntropyLoss()
    
    # Assuming train_loader and dev (test set) are defined
    
    for epoch in range(epoches):
        print("进行第{}个epoch".format(epoch))
        for step, (batch_x, batch_y) in enumerate(train_loader):
            optimizer.zero_grad()
            
            # Reshape batch_x to match input_size
            batch_x = batch_x.view(-1, 1, input_size)
            
            # Forward pass
            output = lstm(batch_x.float())
            
            # Calculate loss
            loss = loss_function(output, batch_y.long())
            
            # Backward pass and optimize
            loss.backward()
            optimizer.step()
            
            if step % 50 == 0:
                with torch.no_grad():
                    # Evaluate on dev set
                    test_x = dev.x.view(-1, 1, input_size)
                    test_output = lstm(test_x.float())
                    _, pred_y = torch.max(test_output, dim=1)
                    
                    # Calculate accuracy
                    accuracy = (pred_y == dev.y).float().mean().item()
                    
                    print('Epoch: {}, Step: {}, Train Loss: {:.4f}, Test Accuracy: {:.2f}'.format(epoch, step, loss.item(), accuracy))
    
    

    Explanation of Improvements:****
    Class Initialization: Pass necessary parameters (input_size, hidden_size, num_layers, output_size) to LSTM class constructor to make it more flexible.
    Forward Method: Properly initialize the hidden state (h0 and c0) within the forward method and pass them to the LSTM module. This ensures correct handling of batch sizes and device placement.
    Training Loop: Use torch.no_grad() context manager for evaluating on the dev set to save memory and computations. Calculate accuracy correctly and improve print statements for clarity.
    These improvements ensure that the LSTM model is initialized and trained correctly while adhering to best practices in PyTorch programming. Adjust parameters (input_size, hidden_size, num_layers, output_size) as per your specific task requirements.

    评论

报告相同问题?

悬赏问题

  • ¥15 kubeadm部署k8s出错
  • ¥15 Abaqus打不开cae文件怎么办?
  • ¥20 双系统开机引导中windows系统消失问题?
  • ¥15 小程序准备上线,软件开发公司需要提供哪些资料给甲方
  • ¥15 关于生产日期批次退货退款,库存回退的问题
  • ¥15 手机应用的时间可以修改吗
  • ¥15 docker 运行OPEN-webui异常
  • ¥15 麒麟系统如何删除光盘刻录痕迹
  • ¥15 recipe通过gem协议传的是什么
  • ¥15 TS2307: Cannot find module 'cc'.