There are invalid byte sequences that can't be converted to Unicode strings. How do I detect that when converting `[]byte` to `string` in Go?
How do I detect when bytes cannot be converted to a string in Go?
1 answer
dpbf62565, 2016-01-18 20:08
You can, as Tim Cooper noted, test UTF-8 validity with `utf8.Valid`.

But! You might be thinking that converting non-UTF-8 bytes to a Go `string` is impossible. In fact, "In Go, a string is in effect a read-only slice of bytes"; it can contain bytes that aren't valid UTF-8, which you can print, access via indexing, or even round-trip back to a `[]byte` (to `Write`, say).

There are two places in the language where Go does do UTF-8 decoding of `string`s for you:

- when you do `for i, r := range s`, the `r` is a Unicode code point as a value of type `rune`
- when you do the conversion `[]rune(s)`, Go decodes the whole string to runes
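
As a rough illustration of the validity check and the lossless round trip described above, here is a minimal sketch. The helper name `toValidString` and its error message are made up for this example and are not part of any library; only `utf8.Valid` and `bytes.Equal` are standard-library calls.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"unicode/utf8"
)

// toValidString is a hypothetical helper: it converts b to a string only if
// b is well-formed UTF-8, and returns an error otherwise.
func toValidString(b []byte) (string, error) {
	if !utf8.Valid(b) {
		return "", errors.New("input is not valid UTF-8")
	}
	return string(b), nil
}

func main() {
	good := []byte("héllo")
	bad := []byte{'h', 0xff, 'i'}

	if s, err := toValidString(good); err == nil {
		fmt.Println("ok:", s)
	}
	if _, err := toValidString(bad); err != nil {
		fmt.Println("error:", err)
	}

	// A plain conversion keeps the invalid byte untouched, so indexing sees
	// 0xff and converting back yields the original bytes.
	s := string(bad)
	fmt.Println(len(s), s[1] == 0xff, bytes.Equal([]byte(s), bad))
}
```

Whether to reject invalid input up front like this, or to let it pass through unchanged, depends on the application. If you already hold a `string`, `utf8.ValidString` performs the same check without a conversion.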
In both these instances invalid UTF-8 is replaced with `U+FFFD`, the replacement character reserved for uses like this. More is in the spec sections on `for` statements and conversions between `string`s and other types. These conversions never crash, so you only need to actively check for UTF-8 validity if it's relevant to your application, like if you want to throw an error on mis-encoded input.

Since that behavior's baked into the language, you can expect it from libraries, too. `U+FFFD` is `utf8.RuneError` and returned by functions in `utf8`.

Here's a sample program showing what Go does with a `[]byte` holding invalid UTF-8:

```go
package main

import "fmt"

func main() {
	a := []byte{0xff}
	s := string(a)
	fmt.Println(s)
	for _, r := range s {
		fmt.Println(r)
	}
	rs := []rune(s)
	fmt.Println(rs)
}
```
Output will look different in different environments, but in the Playground it looks like

```
�
65533
[65533]
```
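
One caveat when relying on `utf8.RuneError`: a well-formed `U+FFFD` that is really present in the input also decodes to that value. The sketch below shows one way to tell the two cases apart, using the width that `utf8.DecodeRune` returns (1 for a decoding failure, 3 for a genuine `U+FFFD`). The helper name `firstInvalid` is made up for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstInvalid is a hypothetical helper: it returns the byte offset of the
// first malformed UTF-8 sequence in b, or -1 if b is entirely valid.
func firstInvalid(b []byte) int {
	for i := 0; i < len(b); {
		r, size := utf8.DecodeRune(b[i:])
		// A decoding failure is reported as (utf8.RuneError, 1). A genuine
		// U+FFFD also decodes to utf8.RuneError, but with size 3.
		if r == utf8.RuneError && size == 1 {
			return i
		}
		i += size
	}
	return -1
}

func main() {
	fmt.Println(firstInvalid([]byte("ok")))       // -1: all valid
	fmt.Println(firstInvalid([]byte("a\xffb")))   // 1: stray 0xff at offset 1
	fmt.Println(firstInvalid([]byte("a\uFFFDb"))) // -1: a real U+FFFD is valid
}
```

This is useful when an error message should point at the offending byte. If a yes/no answer is enough, `utf8.Valid` is simpler.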
This answer was selected by the asker as the best answer.