dtl85148 2016-11-05 06:00
Views: 348
Accepted

How to capture "multiple" repeated groups with a regular expression

I have the following text file that I would like to parse into individual fields:

host_group_web = ( )
host_group_lbnorth = ( lba050 lbhou002 lblon003 )

The fields that I would like to extract are the group name that follows host_group_ and the node names between the parentheses:

  • host_group_web = ( )
  • host_group_lbnorth = ( lba050 lbhou002 lblon003 )

host_group_web has no items between the ( ), so that portion would be ignored.

I've named the first group nodegroup and the items between the ( ) nodes.

I am reading the file line by line and storing the results for further processing.

This is the regex snippet I am using in Go:

hostGroupLine := "host_group_lbnorth = ( lba050 lbhou002 lblon003 )"
hostGroupExp := regexp.MustCompile(`host_group_(?P<nodegroup>[[:alnum:]]+)\s*=\s*\(\s*(?P<nodes>[[:alnum:]]+\s*)`)
hostGroupMatch := hostGroupExp.FindStringSubmatch(hostGroupLine)

for i, name := range hostGroupExp.SubexpNames() {
  if i != 0 {
    fmt.Println("GroupName:", name, "GroupMatch:", hostGroupMatch[i])
  }
}

I get the following output, which is missing the remaining matches for the named group nodes:

GroupName: nodegroup GroupMatch: lbnorth
GroupName: nodes GroupMatch: lba050

The Snippet in Golang Playground

My question is: how do I write a regex in Go that matches the nodegroup and all the nodes that may be in the line, e.g. lba050 lbhou002 lblon003? The number of nodes varies, from zero to arbitrarily many.

1 answer

  • doutu3352 2016-11-05 19:18

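    If you want to capture the group name and all the node names, use a different pattern and call FindAllStringSubmatch, which returns every match in the line. In the pattern below, the first alternative matches the group name and the second matches each node name that is followed by a space. Named capture groups are not needed, but you can use them if you want to.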

    hostGroupExp := regexp.MustCompile(`host_group_([[:alnum:]]+)|([[:alnum:]]+) `)
    
    hostGroupLine := "host_group_lbnorth = ( lba050 lbhou002 lblon003 )"
    hostGroupMatch := hostGroupExp.FindAllStringSubmatch(hostGroupLine, -1)
    
    fmt.Printf("GroupName: %s\n", hostGroupMatch[0][1])
    for i := 1; i < len(hostGroupMatch); i++ {
        fmt.Printf("  Node: %s\n", hostGroupMatch[i][2])
    }
    

    See it in action in playground
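
Go's regexp package keeps only the last iteration of a repeated capture group, so a single FindStringSubmatch call cannot return each node separately. As a minimal sketch of a two-step variant that keeps the named groups from the question: capture everything between the parentheses as one group, then split it on whitespace with strings.Fields. The helper name parseHostGroup is hypothetical, and the pattern assumes that node names never contain a closing parenthesis.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// hostGroupExp captures the group name and the raw text between the parentheses.
var hostGroupExp = regexp.MustCompile(`^host_group_(?P<nodegroup>[[:alnum:]]+)\s*=\s*\((?P<nodes>[^)]*)\)`)

// parseHostGroup is a hypothetical helper. It returns the group name, the
// node names (possibly none), and false if the line does not match.
func parseHostGroup(line string) (string, []string, bool) {
	m := hostGroupExp.FindStringSubmatch(line)
	if m == nil {
		return "", nil, false
	}
	group := m[hostGroupExp.SubexpIndex("nodegroup")]
	// strings.Fields splits on runs of whitespace and drops empty tokens,
	// so "( )" yields an empty slice.
	nodes := strings.Fields(m[hostGroupExp.SubexpIndex("nodes")])
	return group, nodes, true
}

func main() {
	lines := []string{
		"host_group_lbnorth = ( lba050 lbhou002 lblon003 )",
		"host_group_web = ( )",
	}
	for _, line := range lines {
		if group, nodes, ok := parseHostGroup(line); ok {
			fmt.Printf("GroupName: %s Nodes: %v\n", group, nodes)
		}
	}
}
```

Unlike the pattern above, this variant does not depend on every node name being followed by a space, and a line that does not match is reported through the boolean instead of causing an index panic.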

    Alternative:

    You can also parse the way awk would: use a regular expression to split each line into tokens, then print the tokens you need. This only works if every line has the same layout as the ones in your example.

    package main
    
    import (
        "fmt"
        "regexp"
    )
    
    func printGroupName(tokens []string) {
        // tokens: host, group, <name>, =, (, <nodes...>, )
        fmt.Printf("GroupName: %s\n", tokens[2])
        for i := 5; i < len(tokens)-1; i++ {
            fmt.Printf("  Node: %s\n", tokens[i])
        }
    }
    
    func main() {
    
        // regexp line splitter (either _ or space)
        r := regexp.MustCompile(`_| `)
    
        // lines to parse
        hostGroupLines := []string{
            "host_group_lbnorth = ( lba050 lbhou002 lblon003 )",
            "host_group_web = ( web44 web125 )",
            "host_group_web = ( web44 )",
            "host_group_lbnorth = ( )",
        }
    
        // split lines on regexp splitter and print result
        for _, line := range hostGroupLines {
            hostGroupMatch := r.Split(line, -1)
            printGroupName(hostGroupMatch)
        }
    
    }
    

    See it in action in playground

    This answer was selected by the asker as the best answer.
