dream0776 2019-07-31 09:47
Views: 153
Accepted

How do I match a whole string that contains Unicode characters?

I want to validate a string, e.g. a name: a string without spaces. For plain ASCII the regex "^\w+$" would suffice, where ^ and $ anchor the match to the whole string. I tried to achieve the same result for Unicode characters, to support multiple languages, using the \pL character class. But for some reason $ does not match the end of the string. What am I doing wrong?

Code sample is here: https://play.golang.org/p/SPDEbWmqx0N

I copied random characters from: http://www.columbia.edu/~fdc/utf8/

go version go1.12.5 darwin/amd64

package main

import (
    "fmt"
    "regexp"
)

func main() {

    // Unicode letter class \pL
    fmt.Println(regexp.MatchString(`^\pL+$`, "testuser"))        // expected true
    fmt.Println(regexp.MatchString(`^\pL+$`, "user with space")) // expected false

    // Hindi script, anchored at both ends
    fmt.Println(regexp.MatchString(`^\pL+$`, "सकता")) // expected true, but prints false: $ does not match

    // Hindi script, no end anchor
    fmt.Println(regexp.MatchString(`^\pL+`, "सकता")) // expected true

    // Chinese
    fmt.Println(regexp.MatchString(`^\pL+$`, "我能")) // expected true

    // French
    fmt.Println(regexp.MatchString(`^\pL+$`, "ægithaleshâtifs")) // expected true

}
Actual result:
true <nil>
false <nil>
false <nil>
true <nil>
true <nil>
true <nil>

Expected result:
true <nil>
false <nil>
true <nil>
true <nil>
true <nil>
true <nil>

1 answer

  • doushi3715 2019-07-31 09:51

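    \pL matches only letters, but the Hindi string also contains the vowel sign ा (U+093E), which is a combining mark (category Mc), not a letter. \pL+ therefore stops before it, and $ cannot match there. Match marks as well: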

    ^[\p{L}\p{M}]+$
    

    See Go demo.

    Details

    • ^ - start of string
    • [ - start of a character class that matches
      • \p{L} - any Unicode letter
      • \p{M} - any combining mark (diacritics, vowel signs)
    • ]+ - end of the character class, repeat 1+ times
    • $ - end of string.
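
A minimal sketch that applies this pattern to the strings from the question and prints, for each rune of the Hindi word, whether the standard unicode package classifies it as a letter or as a mark. The names nameRE and isName are hypothetical, introduced only for this illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode"
)

// nameRE accepts one or more letters or combining marks, anchored at both ends.
// (Hypothetical name, not from the original post.)
var nameRE = regexp.MustCompile(`^[\p{L}\p{M}]+$`)

// isName reports whether s consists only of letters and combining marks.
func isName(s string) bool {
	return nameRE.MatchString(s)
}

func main() {
	// The same inputs as in the question.
	for _, s := range []string{"testuser", "user with space", "सकता", "我能", "ægithaleshâtifs"} {
		fmt.Printf("%q -> %v\n", s, isName(s))
	}

	// Per-rune classification of the Hindi word: the last rune (U+093E)
	// is a mark, not a letter, which is why ^\pL+$ rejected the string.
	for _, r := range "सकता" {
		fmt.Printf("%U letter=%v mark=%v\n", r, unicode.IsLetter(r), unicode.IsMark(r))
	}
}
```

Compiling the pattern once with regexp.MustCompile avoids recompiling it on every call, which is what regexp.MatchString does.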

    If you also want to match digits and _, as \w does for ASCII, add them to the character class: ^[\p{L}\p{M}0-9_]+$ (ASCII digits only) or ^[\p{L}\p{M}\p{N}_]+$ (any Unicode number).
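
To illustrate the difference between the two digit variants, here is a small sketch. The helper names matchASCIIDigits and matchAnyNumber are assumptions made for this example; the input "सकता_१२" uses the Devanagari digits १ and २, which belong to \p{N} but not to 0-9.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Letters, marks, ASCII digits and underscore.
	asciiDigitsRE = regexp.MustCompile(`^[\p{L}\p{M}0-9_]+$`)
	// Letters, marks, any Unicode number and underscore.
	anyNumberRE = regexp.MustCompile(`^[\p{L}\p{M}\p{N}_]+$`)
)

// matchASCIIDigits is a hypothetical helper for the 0-9 variant.
func matchASCIIDigits(s string) bool { return asciiDigitsRE.MatchString(s) }

// matchAnyNumber is a hypothetical helper for the \p{N} variant.
func matchAnyNumber(s string) bool { return anyNumberRE.MatchString(s) }

func main() {
	for _, s := range []string{"user_01", "सकता_१२", "user 01"} {
		fmt.Printf("%q ascii-digits=%v any-number=%v\n",
			s, matchASCIIDigits(s), matchAnyNumber(s))
	}
}
```

Both variants accept "user_01" and reject "user 01"; only the \p{N} variant accepts the string with Devanagari digits.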

    This answer was accepted by the asker as the best answer.
