duanfu1945 2015-09-30 16:02
Accepted

Golang: How to correctly parse a UTF-8 string coming from C

I'm new to the Go world, so maybe this is obvious.

I have a Go function that I'm exposing to C using go build -buildmode=c-shared and the corresponding //export funcName comment. (You can see it here: https://github.com/udl/bmatch/blob/master/ext/levenshtein.go#L42)

My conversion currently works like this:

func distance(s1in, s2in *C.char) int {
    s1 := C.GoString(s1in)
    s2 := C.GoString(s2in)

How would I handle UTF-8 input here? I've seen that there is a unicode/utf8 package, but I don't quite get how it works: https://golang.org/pkg/unicode/utf8/

Thank you!


1 answer

  • duanpu6319 2015-09-30 16:21

    You don't need to do anything special. UTF-8 is Go's "native" character encoding, so the string returned by C.GoString can be passed directly to the functions from the unicode/utf8 package you mentioned, e.g. utf8.RuneCountInString to get the number of Unicode runes (code points) in a string. Keep in mind that len(s) still returns the number of bytes in the string, not the number of runes.
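To make the byte/rune distinction concrete, here is a minimal sketch. It assumes the C caller passes valid UTF-8 and uses a plain Go string literal as a stand-in for the result of C.GoString, so it runs without cgo. The helper name describe is hypothetical and chosen only for this illustration; it is not taken from the linked repository.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it reports the byte length, the rune
// count, and whether s is valid UTF-8.
func describe(s string) (byteLen, runeCount int, valid bool) {
	return len(s), utf8.RuneCountInString(s), utf8.ValidString(s)
}

func main() {
	// Stand-in for C.GoString(s1in): assumed to hold the same bytes the
	// C caller passed, with no transcoding.
	s := "h\u00e9llo, \u4e16\u754c"

	byteLen, runeCount, valid := describe(s)
	fmt.Println("bytes:", byteLen, "runes:", runeCount, "valid UTF-8:", valid)

	// Converting to []rune yields one element per code point, which is what
	// a character-based algorithm such as edit distance should index into.
	runes := []rune(s)
	fmt.Println("second character:", string(runes[1]))

	// Ranging over a string decodes UTF-8 as well: i is the byte offset of
	// each rune, c is the decoded rune.
	for i, c := range s {
		fmt.Println(i, string(c))
	}
}
```

For something like a Levenshtein distance, converting both inputs with []rune(s) before comparing is probably the simplest choice, since indexing a string directly (s[i]) yields bytes rather than characters. If the C side might send invalid UTF-8, checking utf8.ValidString first is one way to reject it early.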

    See this post in the official blog or this article for some details.

    This answer was selected by the asker as the best answer.
