duanfen2008 2015-03-20 07:13
Views: 513
Accepted

How do I get the Unicode value of a character in Go?

I am trying to get the Unicode value (code point) of a character in a Go string as an int value.

I do this:

value = strconv.Itoa(int(([]byte(char))[0]))

where char contains a string with one character.

That works for many cases. It doesn't work for umlauts like ä, ö, ü, Ä, Ö, Ü.

E.g. Ä results in 65, which is the same as for A.

How can I do that?

Supplement: I had two problems. The first was solved by any of the answers below. The second was trickier: my input was not normalized UTF-8, so umlauts were represented by two code points (a base letter plus a combining mark) instead of one. As ANisus said, the solution is found in the package golang.org/x/text/unicode/norm. The line above is now two lines:

r, _ := utf8.DecodeRune(norm.NFC.Bytes([]byte(char))) // r avoids shadowing the built-in type rune
value = strconv.Itoa(int(r))

Any hints for making this shorter are welcome.

3 answers

  • duanhuang1967 2015-03-20 07:24

    Go strings are conventionally UTF-8 encoded, so to decode a character from a string and get its rune (Unicode code point), you can use the unicode/utf8 package.

    Example:

    package main
    
    import (
        "fmt"
        "unicode/utf8"
    )
    
    func main() {
        str := "AÅÄÖ"
    
        for len(str) > 0 {
            r, size := utf8.DecodeRuneInString(str)
            fmt.Printf("%d %v\n", r, size)
    
            str = str[size:]
        }
    }
    

    Result:

    65 1
    197 2
    196 2
    214 2
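
A shorter variant of the loop above, as a minimal sketch: ranging over a string decodes it as UTF-8 and yields one rune per iteration, so no explicit call to utf8.DecodeRuneInString is needed. The helper name codePoints is made up for this illustration, and the input is written with escapes so that it is certain to be in precomposed form.

```go
package main

import "fmt"

// codePoints is a hypothetical helper: it returns the Unicode code point
// of every rune in s. A range loop over a string decodes UTF-8 and yields
// one rune per iteration.
func codePoints(s string) []int {
	var out []int
	for _, r := range s {
		out = append(out, int(r))
	}
	return out
}

func main() {
	// "AÅÄÖ" in precomposed form, spelled with escapes to avoid ambiguity.
	fmt.Println(codePoints("A\u00c5\u00c4\u00d6")) // [65 197 196 214]
}
```

For a non-empty, single-character string, codePoints(char)[0] is the value the question asks for. Like the loop above, this does not normalize the input.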

    Edit: (To clarify Michael's supplement)

    A character such as Ä can be represented by different sequences of Unicode code points:

    Precomposed: Ä (U+00C4)
    Using combining diaeresis: A (U+0041) + ¨ (U+0308)
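
The difference between the two forms can be seen with the standard library alone. The sketch below (the helper name describe is made up for this illustration) decodes the first rune of each form and counts the code points, which shows why decomposed input reports 65 for Ä.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it returns the first code point of s
// and the total number of code points in s.
func describe(s string) (first rune, count int) {
	first, _ = utf8.DecodeRuneInString(s)
	return first, utf8.RuneCountInString(s)
}

func main() {
	precomposed := "\u00c4" // Ä as a single code point
	decomposed := "A\u0308" // A followed by a combining diaeresis

	r, n := describe(precomposed)
	fmt.Printf("%U %d\n", r, n) // U+00C4 1

	r, n = describe(decomposed)
	fmt.Printf("%U %d\n", r, n) // U+0041 2
}
```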

    In order to get the precomposed form, one can use the normalization package, golang.org/x/text/unicode/norm. The NFC (Canonical Decomposition, followed by Canonical Composition) form will turn U+0041 + U+0308 into U+00C4:

    c := "\u0041\u0308"
    r, _ := utf8.DecodeRune(norm.NFC.Bytes([]byte(c)))
    fmt.Printf("%+q", r) // '\u00c4'
    
    The asker selected this answer as the best answer.