duanfu1945 2015-09-30 08:02
Views: 141
Accepted

Golang: How to correctly parse a UTF-8 string from C

I'm new to the Go world, so maybe this is obvious.

I have a Go function that I'm exposing to C using go build -buildmode=c-shared and a corresponding //export funcName comment. (You can see it here: https://github.com/udl/bmatch/blob/master/ext/levenshtein.go#L42)

My conversion currently works like this:

func distance(s1in, s2in *C.char) int {
    s1 := C.GoString(s1in)
    s2 := C.GoString(s2in)

How would I handle UTF-8 input here? I've seen there is a UTF-8 package but I don't quite get how it works. https://golang.org/pkg/unicode/utf8/

Thank you!


1 answer

  • duanpu6319 2015-09-30 08:21

    You don't need to do anything special. C.GoString copies the bytes of the C string unchanged, and UTF-8 is Go's "native" character encoding, so you can use the functions from the unicode/utf8 package you mentioned, e.g. utf8.RuneCountInString to get the number of Unicode code points (runes) in a string. Keep in mind that len(s) still returns the number of bytes in the string, not the number of runes.
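
To make the byte/rune distinction concrete, here is a minimal sketch in plain Go (no cgo, so it runs on its own). The helper name describe and the sample string are made up for illustration; only len, utf8.RuneCountInString, utf8.ValidString and the []rune conversion come from the language and standard library. It assumes the string you got from C.GoString behaves like any other Go string, which should be the case since it is just a copy of the bytes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it reports the byte length of s,
// the number of runes in s, and whether s is valid UTF-8.
func describe(s string) (byteLen, runeCount int, valid bool) {
	return len(s), utf8.RuneCountInString(s), utf8.ValidString(s)
}

func main() {
	s := "Go 世界" // 3 ASCII bytes plus two 3-byte characters

	b, r, ok := describe(s)
	fmt.Println(b, r, ok) // prints "9 5 true"

	// Converting to []rune gives one element per code point, which is
	// usually what a character-based algorithm such as an edit distance
	// wants to index into.
	rs := []rune(s)
	fmt.Println(string(rs[3])) // prints "世"
}
```

Input coming from C is not guaranteed to be valid UTF-8, so checking utf8.ValidString before doing rune-based work may be worthwhile; whether that check is needed depends on what the C caller passes in.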

    See this post in the official blog or this article for some details.

    This answer was selected as the best answer by the asker.