kaichengw1 2021-05-07 19:30 采纳率: 50%
浏览 30

lstm训练遇到瓶颈 测试集正确率在44

 

class LSTM(nn.Module):
    def __init__(self):
        super(LSTM, self).__init__()
        self.lstm = nn.LSTM(input_size=input_size,
                            hidden_size=hidden_size,   
                            num_layers=num_layers,     
                            batch_first=True,
                            bidirectional = True)
        self.output_layer = nn.Linear(in_features=hidden_size*2, out_features=4)
        self.dropout = nn.Dropout(p=0.5)
        
    def forward(self, x):
        lstm_out, (h_n, h_c) = self.lstm(x, None)
        lstm_out = self.dropout(lstm_out)
        output = self.output_layer(lstm_out[:, -1, :])
        return output

lstm = LSTM()
lstm = lstm.float()
print(lstm)
optimizer = torch.optim.Adam(lstm.parameters(), lr=learning_rate)
loss_function = nn.CrossEntropyLoss()
for epoch in range(epoches):
        print("进行第{}个epoch".format(epoch))
        for step, (batch_x, batch_y) in enumerate(train_loader):
            optimizer.zero_grad()
            
            batch_x = batch_x.view(-1,1,300)
            output = lstm(batch_x.float())
            
            loss = loss_function(output, batch_y.long())
            loss.backward()
            optimizer.step()
            
            if step % 50 == 0:
                test_x = dev.x.view(-1,1,300)
                test_output = lstm(test_x.float())
                pred_y = torch.max(test_output, dim=1)[1].data.numpy()
                
                accuracy = ((pred_y == dev.y.data.numpy()).astype(int).sum()) / float(dev.y.size(0))
                print('Epoch: ', epoch, '| train loss: %.4f' % loss.data.numpy(), '| test accuracy: %.2f' % accuracy)

 这是一个根据文本内容分析用户地点的lstm,正确率一直没法提高,是哪里出了问题,有没有可以改进的地方?

  • 写回答

1条回答 默认 最新

  • AI Whisper 2024-07-12 01:01
    关注
    
    import torch
    import torch.nn as nn
    
    class LSTM(nn.Module):
        def __init__(self, input_size, hidden_size, num_layers, output_size):
            super(LSTM, self).__init__()
            self.hidden_size = hidden_size
            self.num_layers = num_layers
            self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True, bidirectional=True)
            self.output_layer = nn.Linear(hidden_size*2, output_size)
            self.dropout = nn.Dropout(p=0.5)
            
        def forward(self, x):
            # Initialize hidden state with zeros
            h0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(x.device)  # 2 for bidirectional
            c0 = torch.zeros(self.num_layers*2, x.size(0), self.hidden_size).to(x.device)
            
            # Forward propagate LSTM
            lstm_out, _ = self.lstm(x, (h0, c0))
            
            # Dropout and apply output layer
            lstm_out = self.dropout(lstm_out)
            output = self.output_layer(lstm_out[:, -1, :])
            return output
    
    # Example usage and training loop
    input_size = 300
    hidden_size = 128
    num_layers = 2
    output_size = 4
    learning_rate = 0.001
    epoches = 10
    
    lstm = LSTM(input_size, hidden_size, num_layers, output_size)
    print(lstm)
    
    optimizer = torch.optim.Adam(lstm.parameters(), lr=learning_rate)
    loss_function = nn.CrossEntropyLoss()
    
    # Assuming train_loader and dev (test set) are defined
    
    for epoch in range(epoches):
        print("进行第{}个epoch".format(epoch))
        for step, (batch_x, batch_y) in enumerate(train_loader):
            optimizer.zero_grad()
            
            # Reshape batch_x to match input_size
            batch_x = batch_x.view(-1, 1, input_size)
            
            # Forward pass
            output = lstm(batch_x.float())
            
            # Calculate loss
            loss = loss_function(output, batch_y.long())
            
            # Backward pass and optimize
            loss.backward()
            optimizer.step()
            
            if step % 50 == 0:
                with torch.no_grad():
                    # Evaluate on dev set
                    test_x = dev.x.view(-1, 1, input_size)
                    test_output = lstm(test_x.float())
                    _, pred_y = torch.max(test_output, dim=1)
                    
                    # Calculate accuracy
                    accuracy = (pred_y == dev.y).float().mean().item()
                    
                    print('Epoch: {}, Step: {}, Train Loss: {:.4f}, Test Accuracy: {:.2f}'.format(epoch, step, loss.item(), accuracy))
    
    

    Explanation of Improvements:****
    Class Initialization: Pass necessary parameters (input_size, hidden_size, num_layers, output_size) to LSTM class constructor to make it more flexible.
    Forward Method: Properly initialize the hidden state (h0 and c0) within the forward method and pass them to the LSTM module. This ensures correct handling of batch sizes and device placement.
    Training Loop: Use torch.no_grad() context manager for evaluating on the dev set to save memory and computations. Calculate accuracy correctly and improve print statements for clarity.
    These improvements ensure that the LSTM model is initialized and trained correctly while adhering to best practices in PyTorch programming. Adjust parameters (input_size, hidden_size, num_layers, output_size) as per your specific task requirements.

    评论

报告相同问题?

悬赏问题

  • ¥15 metadata提取的PDF元数据,如何转换为一个Excel
  • ¥15 关于arduino编程toCharArray()函数的使用
  • ¥100 vc++混合CEF采用CLR方式编译报错
  • ¥15 coze 的插件输入飞书多维表格 app_token 后一直显示错误,如何解决?
  • ¥15 vite+vue3+plyr播放本地public文件夹下视频无法加载
  • ¥15 c#逐行读取txt文本,但是每一行里面数据之间空格数量不同
  • ¥50 如何openEuler 22.03上安装配置drbd
  • ¥20 ING91680C BLE5.3 芯片怎么实现串口收发数据
  • ¥15 无线连接树莓派,无法执行update,如何解决?(相关搜索:软件下载)
  • ¥15 Windows11, backspace, enter, space键失灵