The Go Programming Language Specification
Conversions
Conversions to and from a string type
Converting a signed or unsigned integer value to a string type yields
a string containing the UTF-8 representation of the integer. Values
outside the range of valid Unicode code points are converted to
"\uFFFD".
Converting a slice of runes to a string type yields a string that is
the concatenation of the individual rune values converted to strings.
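The two conversion rules quoted above can be checked with a short program. This is only a minimal sketch: the helper names fromCodePoint and fromRunes are hypothetical and exist just for illustration, and the sample values are the ones used in the specification's own examples.

```go
package main

import "fmt"

// fromCodePoint converts a single integer (here a rune, i.e. an int32)
// to a string holding the UTF-8 encoding of that code point.
// Hypothetical helper, for illustration only.
func fromCodePoint(r rune) string {
	return string(r)
}

// fromRunes converts a slice of runes to a string: the result is the
// concatenation of each rune converted to a string.
// Hypothetical helper, for illustration only.
func fromRunes(rs []rune) string {
	return string(rs)
}

func main() {
	fmt.Println(fromCodePoint('a') == "a")           // true
	fmt.Println(fromCodePoint(0xf8) == "\u00f8")     // true
	fmt.Println(fromCodePoint(-1) == "\uFFFD")       // true: negative value is not a code point
	fmt.Println(fromCodePoint(0x110000) == "\uFFFD") // true: above the last code point U+10FFFF
	fmt.Println(fromRunes([]rune{0x767d, 0x9d6c, 0x7fd4}) == "白鵬翔") // true
}
```

The helpers take a rune rather than an int because go vet reports a direct string(i) conversion of a plain int, which is often a mistaken attempt to format the number as decimal text.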
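Type byte in Go is an alias for type uint8.
Type rune, which holds a Unicode code point (an integer in the range 0 to 0x10FFFF, so it fits in 21 bits), is an alias for int32. Because int32 is signed and wider than that range, a rune variable can also hold values that are not valid code points.
When runes are converted to a string, Go encodes the Unicode code points as UTF-8.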
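To make the alias and encoding statements concrete, here is a small sketch. The helper encodedBytes is a hypothetical name introduced only for this illustration; it relies on nothing beyond the standard string and []byte conversions and the unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedBytes returns the UTF-8 bytes produced when a single rune is
// converted to a string. Hypothetical helper, for illustration only.
func encodedBytes(r rune) []byte {
	return []byte(string(r))
}

func main() {
	var b byte = 'a'
	var u uint8 = b // no conversion needed: byte is an alias for uint8
	var r rune = '\u00e9'
	var i int32 = r // no conversion needed: rune is an alias for int32
	fmt.Println(u, i) // 97 233

	// One code point occupies between 1 and 4 bytes once encoded as UTF-8.
	for _, r := range []rune{'a', '\u00e9', '日', utf8.MaxRune} {
		fmt.Printf("%U -> % x (%d bytes)\n", r, encodedBytes(r), utf8.RuneLen(r))
	}
}
```

The assignments between byte and uint8, and between rune and int32, compile without an explicit conversion, which is what distinguishes an alias from a distinct named type.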
For your example,
package main

import (
	"fmt"
	"unicode"
)

func main() {
	// Valid Unicode code points lie in the range 0 to 0x10FFFF,
	// but a rune (int32) can hold values outside that range.
	runes := make([]rune, 3)
	runes[0] = 97
	runes[1] = -22 // invalid Unicode code point
	runes[2] = 99
	fmt.Println(runes)

	// Encode Unicode code points as UTF-8.
	// Invalid code points are converted to the Unicode replacement character (U+FFFD).
	s := string(runes)
	fmt.Println(s)

	// Decode UTF-8 back into Unicode code points.
	for _, r := range s {
		fmt.Println(r, string(r), r == unicode.ReplacementChar)
	}
}
Playground: https://play.golang.org/p/AZUBd2iZp1F
Output:
[97 -22 99]
a�c
97 a false
65533 � true
99 c false
References:
The Go Programming Language Specification
The Go Blog: Strings, bytes, runes and characters in Go
The Unicode Consortium