dsfsad089111 2018-04-05 16:13

Regex to test for Latin letters in Go

I'm trying to write a regex in Go to test for Latin letters only.

I know that \p{Latin} matches any Latin-script character, but it also matches things such as Roman numerals (e.g. "ⅻ"). That leads me to \p{L}, which matches Unicode letters, but in any script, not just Latin.

The best I've been able to come up with so far is two regexes with an &&:

latinRe := regexp.MustCompile(`\p{Latin}`)
letterRe := regexp.MustCompile(`\p{L}`)
if latinRe.MatchString(testString) && letterRe.MatchString(testString) {...}

I'm not happy that I can't test this as easily using something like regex101.com. Is there a better way? More succinct? Performant?

1 answer

  • duanbei3704 2018-04-05 16:20

    You can use a range like the following to specify all the characters you want to match. Depending on the regex engine, one of the following should work (Go's regexp package accepts the \x form; it does not recognize \u escapes inside a pattern):

    See regex in use here: Adapted from this link

    [A-Za-z\u00C0-\u00D6\u00D8-\u00f6\u00f8-\u00ff]
    [A-Za-z\xC0-\xD6\xD8-\xf6\xf8-\xff]
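A minimal sketch of how the \x form of this range might be used in Go. The variable name latin1Letters and the sample strings are illustrative assumptions, as are the ^, + and $ added so that the whole string has to consist of listed characters. Because the range stops at U+00FF, Latin letters beyond the Latin-1 Supplement (such as "Ł") are rejected.

```go
package main

import (
	"fmt"
	"regexp"
)

// latin1Letters is a hypothetical name. The class is the \x form of the
// range from the answer; ^, + and $ are added here so the entire string
// must be made of listed characters.
var latin1Letters = regexp.MustCompile(`^[A-Za-z\xC0-\xD6\xD8-\xF6\xF8-\xFF]+$`)

func main() {
	// "a×b" contains U+00D7 (multiplication sign), which the range skips.
	// "ⅻ" and "Łódź" contain code points above U+00FF.
	for _, s := range []string{"Éclair", "a×b", "ⅻ", "Łódź"} {
		fmt.Printf("%q: %v\n", s, latin1Letters.MatchString(s))
	}
}
```

This keeps the pattern easy to read, at the cost of covering only the letters that are listed explicitly.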
    

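    Another option is to subtract specific characters from a Unicode character class. The negated class below excludes everything that is not Latin script (\P{Latin}) and every number (\p{N}), which leaves Latin-script characters that are not numbers: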

    See regex in use here

    [^\P{Latin}\p{N}]
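As a rough sketch, the negated class can be anchored and compiled once in Go, replacing the two-regex test from the question. The names latinLetterRe and isLatinLetters are assumptions made for illustration. The second function is not part of the answer: it is a regex-free variant built on the standard unicode package, and it checks "Latin script and letter" rather than "Latin script and not a number", so the two may disagree on Latin-script characters that are neither letters nor numbers.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode"
)

// latinLetterRe (hypothetical name) anchors the answer's class so that every
// character in the string must be Latin script and not in category N.
var latinLetterRe = regexp.MustCompile(`^[^\P{Latin}\p{N}]+$`)

// isLatinLetters (hypothetical helper) reports whether s is non-empty and
// every rune is both in the Latin script table and in category L.
func isLatinLetters(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if !unicode.Is(unicode.Latin, r) || !unicode.IsLetter(r) {
			return false
		}
	}
	return true
}

func main() {
	for _, s := range []string{"Łódź", "ⅻ", "abc1", "Привет"} {
		fmt.Printf("%q: regex=%v unicode=%v\n",
			s, latinLetterRe.MatchString(s), isLatinLetters(s))
	}
}
```

Either form tests each character against both conditions at once, whereas two separate unanchored patterns joined with && can be satisfied by different characters in the same string.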
    
    This answer was selected by the asker as the best answer.
