drlhsfqoa350437979 2016-01-18 18:20
53 views
Answer accepted

How do I detect in Go when bytes cannot be converted to a string?

There are invalid byte sequences that can't be converted to Unicode strings. How do I detect that when converting []byte to string in Go?


1 answer

  • dpbf62565 2016-01-18 20:08

    You can, as Tim Cooper noted, test UTF-8 validity with utf8.Valid.
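
    As a minimal sketch of that check, the helper below refuses to convert bytes that fail utf8.Valid. The names toValidString and errInvalidUTF8 are made up for this example and are not part of any library; only utf8.Valid comes from the standard unicode/utf8 package.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// errInvalidUTF8 is a hypothetical sentinel error used only in this sketch.
var errInvalidUTF8 = errors.New("input is not valid UTF-8")

// toValidString (hypothetical helper) converts b to a string only when
// b consists entirely of valid UTF-8; otherwise it returns an error.
func toValidString(b []byte) (string, error) {
	if !utf8.Valid(b) {
		return "", errInvalidUTF8
	}
	return string(b), nil
}

func main() {
	inputs := [][]byte{[]byte("héllo"), {0xff, 0xfe}}
	for _, b := range inputs {
		s, err := toValidString(b)
		fmt.Printf("%q %v\n", s, err)
	}
}
```

    If you already hold a string rather than a []byte, utf8.ValidString performs the same check without a conversion.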

    But! You might be thinking that converting non-UTF-8 bytes to a Go string is impossible. In fact, "In Go, a string is in effect a read-only slice of bytes"; it can contain bytes that aren't valid UTF-8 which you can print, access via indexing, or even round-trip back to a []byte (to Write, say).
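
    To illustrate the point above, here is a small sketch showing that invalid bytes survive the trip through a string untouched. The helper roundTrips is a name invented for this example.

```go
package main

import (
	"bytes"
	"fmt"
)

// roundTrips (hypothetical helper) reports whether b is unchanged after
// converting []byte -> string -> []byte.
func roundTrips(b []byte) bool {
	s := string(b)
	return bytes.Equal([]byte(s), b)
}

func main() {
	b := []byte{'h', 'i', 0xff, 0xfe} // the last two bytes are not valid UTF-8
	s := string(b)

	fmt.Println(len(s))        // length in bytes, not in runes
	fmt.Printf("%x\n", s[2])   // indexing yields the raw byte
	fmt.Printf("%q\n", s)      // %q escapes the invalid bytes so they are visible
	fmt.Println(roundTrips(b)) // the bytes come back exactly as they went in
}
```

    Neither the conversion nor indexing decodes anything, which is why no replacement characters appear here.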

    There are two places in the language where Go does UTF-8 decoding of strings for you.

    • when you write for i, r := range s, r is a Unicode code point of type rune, and i is the byte offset where it starts
    • when you do the conversion []rune(s), Go decodes the whole string to runes

    In both cases, invalid UTF-8 is replaced with U+FFFD, the replacement character reserved for uses like this. The spec sections on for statements and on conversions between strings and other types have the details. These conversions never panic, so you only need to check UTF-8 validity explicitly if it matters to your application, for example if you want to return an error on mis-encoded input.

    Since that behavior is baked into the language, you can expect it from libraries, too. U+FFFD is available as the constant utf8.RuneError, and the decoding functions in unicode/utf8 return it for invalid input.
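
    One caveat when relying on utf8.RuneError: a correctly encoded U+FFFD also decodes to that value. A rough sketch of telling the two apart uses the width returned by utf8.DecodeRuneInString, which is 1 for an invalid byte and 3 for a genuine U+FFFD. The helper firstInvalidByte is a hypothetical name for this example.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstInvalidByte (hypothetical helper) returns the byte offset of the
// first invalid UTF-8 sequence in s, or -1 if s is entirely valid.
func firstInvalidByte(s string) int {
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		// RuneError with width 1 signals bad encoding; a real U+FFFD
		// in the input decodes with width 3.
		if r == utf8.RuneError && size == 1 {
			return i
		}
		i += size
	}
	return -1
}

func main() {
	fmt.Println(firstInvalidByte("ok\xffbad")) // 2
	fmt.Println(firstInvalidByte("ok\uFFFD"))  // -1
}
```

    When you only need a yes-or-no answer, utf8.ValidString is simpler; a loop like this is useful mainly when you want to report where the bad byte is.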

    Here's a sample program showing what Go does with a []byte holding invalid UTF-8:

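    package main
    
    import "fmt"
    
    func main() {
        // 0xff can never appear in valid UTF-8.
        a := []byte{0xff}
        // The conversion copies the bytes unchanged; nothing is validated.
        s := string(a)
        fmt.Println(s)
        // range decodes UTF-8; the invalid byte yields U+FFFD (65533).
        for _, r := range s {
            fmt.Println(r)
        }
        // []rune(s) decodes the whole string the same way.
        rs := []rune(s)
        fmt.Println(rs)
    }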
    

    Output will look different in different environments, but in the Playground it looks like

    �
    65533
    [65533]
    
    This answer was selected by the asker as the best answer.
