dqwmhrxt68679
2019-04-10 18:21
Views: 499
Accepted

Converting command-line output containing a UTF-8 byte string to Unicode code points in Go

I am running an executable from Go via the os/exec package, which gives me the following output: (\\xe2\\x96\\xb2). The output contains a UTF-8 byte sequence written as escape sequences, which I want to convert to the corresponding Unicode code point (U+25B2). What I expect to see after the conversion is: "(▲)". I have looked at this entry in the Go Blog (https://blog.golang.org/strings), but it starts out with an interpreted string literal, whereas the command output behaves like a raw string literal: the backslashes are literal characters. I have tried strconv.Quote and strconv.Unquote, neither of which achieves what I'm looking for.

1 answer

  • douyong4623 2019-04-10 21:30
    Accepted

    You can use the strconv package to parse the string literal containing the escape sequences.

    The quick and dirty way is to add the missing quotes and interpret the result as a quoted string using strconv.Unquote:

    s := `\xe2\x96\xb2`
    // On success s holds the decoded bytes ("▲"); check err before using s.
    s, err := strconv.Unquote(`"` + s + `"`)
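
As a complete program, the Unquote approach might look like the sketch below. The helper name decodeEscapes and the hard-coded sample string are illustrative assumptions; in practice the input would be the captured command output. The sketch also assumes the output contains no unescaped double quote or newline, since strconv.Unquote rejects both inside a double-quoted string.

```go
package main

import (
	"fmt"
	"strconv"
)

// decodeEscapes is a hypothetical helper: it wraps raw in double quotes and
// lets strconv.Unquote interpret Go-style escapes such as \xe2.
// Assumption: raw contains no unescaped double quote or newline.
func decodeEscapes(raw string) (string, error) {
	return strconv.Unquote(`"` + raw + `"`)
}

func main() {
	out := `(\xe2\x96\xb2)` // stand-in for the captured command output
	s, err := decodeEscapes(out)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(s)                // (▲)
	fmt.Printf("%U\n", []rune(s)) // [U+0028 U+25B2 U+0029]
}
```

Converting the decoded string to []rune is what exposes the code points; the string itself already holds the UTF-8 bytes e2 96 b2 for U+25B2.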
    

    You can also parse the string directly, one character at a time (which is what Unquote does internally), using strconv.UnquoteChar:

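    s := `\xe2\x96\xb2`
    buf := make([]byte, 0, 3*len(s)/2)
    for len(s) > 0 {
        // quote=0: both ' and " may appear unescaped in the input.
        c, multibyte, rest, err := strconv.UnquoteChar(s, 0)
        if err != nil {
            log.Fatal(err)
        }
        s = rest
        if multibyte {
            // c is a full code point (e.g. from \u25b2): append its UTF-8 encoding.
            buf = append(buf, string(c)...)
        } else {
            // c is a single byte (e.g. from \xe2): append it unchanged.
            buf = append(buf, byte(c))
        }
    }
    s = string(buf)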
    

    https://play.golang.org/p/6SDij9d-aRr
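
The character-by-character approach can also be wrapped in a function that returns an error instead of exiting. This is a minimal sketch; the name unescape and the sample inputs are assumptions made for illustration. Unlike the quote-wrapping shortcut, it tolerates unescaped double quotes in the input, because UnquoteChar is called with quote set to 0.

```go
package main

import (
	"fmt"
	"strconv"
)

// unescape is a hypothetical helper that decodes Go-style escape sequences
// one character at a time using strconv.UnquoteChar.
func unescape(raw string) (string, error) {
	buf := make([]byte, 0, len(raw))
	for len(raw) > 0 {
		// quote=0 allows ' and " to appear unescaped.
		c, multibyte, rest, err := strconv.UnquoteChar(raw, 0)
		if err != nil {
			return "", err
		}
		raw = rest
		if multibyte {
			// A full code point: append its UTF-8 encoding.
			buf = append(buf, string(c)...)
		} else {
			// A single byte, for example from a \x escape.
			buf = append(buf, byte(c))
		}
	}
	return string(buf), nil
}

func main() {
	s, err := unescape(`say "\xe2\x96\xb2" or \u25b2`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(s) // say "▲" or ▲
}
```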
