dream0776 2019-07-31 09:47
浏览 153
已采纳

如何匹配包含Unicode字符的完整字符串?

I want to validate a string for e.g. name. A string without spaces. For normal Ascii a following regex would suffice "^\w+$" where ^ and $ takes the whole string into consideration. I tried to achieve the same result for unicode characters for supporting multiple languages using the \pL character class. But for some reason $ doesn't help match end of string. What am I doing wrong?

Code sample is here: https://play.golang.org/p/SPDEbWmqx0N

I copy pasted random characters from: http://www.columbia.edu/~fdc/utf8/

go version go1.12.5 darwin/amd64

package main

import (
    "fmt"
    "regexp"
)

func main() {

    // Unicode character class

    fmt.Println(regexp.MatchString(`^\pL+$`, "testuser"))  // expected true
    fmt.Println(regexp.MatchString(`^\pL+$`, "user with space")) // expected false 


    // Hindi script
    fmt.Println(regexp.MatchString(`^\pL+$`, "सकता")) // expected true doesn't match end of line

    // Hindi script
    fmt.Println(regexp.MatchString(`^\pL+`, "सकता")) // expected true

    // Chinese
    fmt.Println(regexp.MatchString(`^\pL+$`, "我能")) // expected true

    //French
    fmt.Println(regexp.MatchString(`^\pL+$`, "ægithaleshâtifs")) // expected true 

}
actual result:
true  <nil>
false <nil>
false <nil>
true <nil>
true <nil>
true <nil>

expected result:
true <nil>
false <nil>
true <nil>
true <nil>
true <nil>
true <nil>
  • 写回答

1条回答 默认 最新

  • doushi3715 2019-07-31 09:51
    关注

    You may use

    ^[\p{L}\p{M}]+$
    

    See Go demo.

    Details

    • ^ - start of string
    • [ - start of a character class that matches
      • \p{L} - any BMP letter
      • \p{M} - any diacritic
    • ]+ - end of the character class, repeat 1+ times
    • $ - end of string.

    If you plan to also match digits and _ as \w does, add them to the character class, ^[\p{L}\p{M}0-9_]+$ or ^[\p{L}\p{M}\p{N}_]+$.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 stata安慰剂检验作图但是真实值不出现在图上
  • ¥15 c程序不知道为什么得不到结果
  • ¥40 复杂的限制性的商函数处理
  • ¥15 程序不包含适用于入口点的静态Main方法
  • ¥15 素材场景中光线烘焙后灯光失效
  • ¥15 请教一下各位,为什么我这个没有实现模拟点击
  • ¥15 执行 virtuoso 命令后,界面没有,cadence 启动不起来
  • ¥50 comfyui下连接animatediff节点生成视频质量非常差的原因
  • ¥20 有关区间dp的问题求解
  • ¥15 多电路系统共用电源的串扰问题