dqvy87517 2016-07-25 06:40

Golang regular expression to match multiple patterns between pairs of keywords

I have a string that contains two keywords, "CURRENT NAME(S)" and "NEW NAME(S)", each followed by a list of words. I want to extract the words that follow each keyword. To illustrate with code:

    s := `"CURRENT NAME(S)
 Name1, Name2",,"NEW NAME(S)
NewName1,NewName2"`
    re := regexp.MustCompile(`"CURRENT NAME(S).*",,"NEW NAME(S).*"`)

    segs := re.FindAllString(s, -1)
    fmt.Println("segs:", segs)

    segs2 := re.FindAllStringSubmatch(s, -1)
    fmt.Println("segs2:", segs2)

As you can see, the string s holds the input. "Name1, Name2" is the current names list and "NewName1,NewName2" is the new names list. I want to extract these two lists. The two quoted fields are separated by commas. Each keyword comes right after an opening double quote, and its list ends at the matching closing double quote.

How can I use regexp so that the program prints "Name1, Name2" and "NewName1,NewName2"?

3 answers

  • doushai7225 2016-07-25 08:29

    The issue with your regex is that the input string contains newlines, and . in a Go regex does not match a newline by default. Another issue is that .* is greedy: it matches as many characters as it can, so it runs up to the last place where the rest of the pattern can still match. Finally, the parentheses must be escaped as \( and \) to match literal ( and ) characters; left unescaped, they form a group instead.
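The three pitfalls above can be checked in isolation. The following is a minimal sketch, not part of the original answer; the helper names matches and firstGroup are hypothetical and exist only for this illustration. It also shows the (?s) flag, which is the standard way to let . match a newline in Go's regexp syntax.

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether pattern matches anywhere in s.
// (Hypothetical helper for this illustration.)
func matches(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

// firstGroup returns capturing group 1 of the first match, or "" if there is none.
// (Hypothetical helper for this illustration.)
func firstGroup(pattern, s string) string {
	m := regexp.MustCompile(pattern).FindStringSubmatch(s)
	if len(m) < 2 {
		return ""
	}
	return m[1]
}

func main() {
	// 1. By default "." does not match "\n"; the (?s) flag changes that.
	fmt.Println(matches(`a.b`, "a\nb"))     // false
	fmt.Println(matches(`(?s)a.b`, "a\nb")) // true

	// 2. Unescaped parentheses form a group, so NAME(S) looks for "NAMES".
	fmt.Println(matches(`NAME(S)`, "NAME(S)"))   // false
	fmt.Println(matches(`NAME\(S\)`, "NAME(S)")) // true

	// 3. Greedy .* runs to the last quote; [^"]* stops at the first one.
	fmt.Println(firstGroup(`"(.*)"`, `"a","b"`))    // a","b
	fmt.Println(firstGroup(`"([^"]*)"`, `"a","b"`)) // a
}
```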

    The simplest fix is to replace .* with the negated character class [^"]*, which matches any character other than a double quote (newlines included), and to wrap it in a pair of unescaped ( and ) to form a capturing group, so that the text can be retrieved as a submatch.

    Here is a Go demo:

    package main
    
    import (
        "fmt"
        "regexp"
    )
    
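    func main() {
        // Raw string literal: the line breaks inside it are part of the input.
        s := `"CURRENT NAME(S)
     Name1, Name2",,"NEW NAME(S)
    NewName1,NewName2"`
        // \( and \) match literal parentheses; each ([^"]*) captures one list.
        re := regexp.MustCompile(`"CURRENT NAME\(S\)\s*([^"]*)",,"NEW NAME\(S\)\s*([^"]*)"`)
    
        // Each match is a slice: [full match, group 1, group 2].
        segs2 := re.FindAllStringSubmatch(s, -1)
        if len(segs2) == 0 {
            fmt.Println("no match")
            return
        }
        fmt.Printf("segs2: [%s; %s]\n", segs2[0][1], segs2[0][2])
    }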
    

    Now, the regex matches:

    • "CURRENT NAME\(S\) - the literal string "CURRENT NAME(S)
    • \s* - zero or more whitespace characters (this includes newlines)
    • ([^"]*) - Group 1, capturing zero or more characters other than "
    • ",,"NEW NAME\(S\) - the literal string ",,"NEW NAME(S)
    • \s* - zero or more whitespace characters
    • ([^"]*) - Group 2, capturing zero or more characters other than "
    • " - a literal "
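If individual names are needed rather than the two raw lists, the captured groups can be split on commas afterwards. This is a sketch under assumptions: the helpers splitNames and extractNames are hypothetical names introduced here, and it assumes that names never contain a comma or a double quote.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namesRe is the pattern from the answer, compiled once at package level.
var namesRe = regexp.MustCompile(`"CURRENT NAME\(S\)\s*([^"]*)",,"NEW NAME\(S\)\s*([^"]*)"`)

// splitNames (hypothetical helper) splits a comma-separated list,
// trims surrounding whitespace, and drops empty entries.
func splitNames(list string) []string {
	var names []string
	for _, part := range strings.Split(list, ",") {
		if name := strings.TrimSpace(part); name != "" {
			names = append(names, name)
		}
	}
	return names
}

// extractNames (hypothetical helper) returns the current and new names
// found in s; ok is false when the pattern does not match.
func extractNames(s string) (current, updated []string, ok bool) {
	m := namesRe.FindStringSubmatch(s)
	if m == nil {
		return nil, nil, false
	}
	return splitNames(m[1]), splitNames(m[2]), true
}

func main() {
	s := `"CURRENT NAME(S)
 Name1, Name2",,"NEW NAME(S)
NewName1,NewName2"`
	current, updated, ok := extractNames(s)
	fmt.Println(current, updated, ok) // [Name1 Name2] [NewName1 NewName2] true
}
```

Returning an ok flag instead of indexing the match directly avoids a panic when the input does not contain both keywords.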
    This answer was selected as the best answer by the asker.