普通网友 2012-11-22 10:18
浏览 676
已采纳

golang将iso8859-1转换为utf8

I am trying to convert an ISO 8859-1 encoded string to UTF-8.

The following function works with my testdata which contains german umlauts, but I'm not quite sure what source encoding the rune(b) cast assumes. Is it assuming some kind of default encoding, e.g. ISO8859-1 or is there any way to tell it what encoding to use?

func toUtf8(iso8859_1_buf []byte) string {
   var buf = bytes.NewBuffer(make([]byte, len(iso8859_1_buf)*4))
   for _, b := range(iso8859_1_buf) {
      r := rune(b)
      buf.WriteRune(r)
   }
   return string(buf.Bytes())
}
  • 写回答

2条回答 默认 最新

  • douruanfan3030 2012-11-22 11:11
    关注

    rune is an alias for int32, and when it comes to encoding, a rune is assumed to have a Unicode character value (code point). So the value b in rune(b) should be a unicode value. For 0x00 - 0xFF this value is identical to Latin-1, so you don't have to worry about it.

    Then you need to encode the runes into UTF8. But this encoding is simply done by converting a []rune to string.

    This is an example of your function without using the bytes package:

    func toUtf8(iso8859_1_buf []byte) string {
        buf := make([]rune, len(iso8859_1_buf))
        for i, b := range iso8859_1_buf {
            buf[i] = rune(b)
        }
        return string(buf)
    }
    
    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥15 运筹学排序问题中的在线排序
  • ¥15 关于docker部署flink集成hadoop的yarn,请教个问题 flink启动yarn-session.sh连不上hadoop,这个整了好几天一直不行,求帮忙看一下怎么解决
  • ¥30 求一段fortran代码用IVF编译运行的结果
  • ¥15 深度学习根据CNN网络模型,搭建BP模型并训练MNIST数据集
  • ¥15 lammps拉伸应力应变曲线分析
  • ¥15 C++ 头文件/宏冲突问题解决
  • ¥15 用comsol模拟大气湍流通过底部加热(温度不同)的腔体
  • ¥50 安卓adb backup备份子用户应用数据失败
  • ¥20 有人能用聚类分析帮我分析一下文本内容嘛
  • ¥15 请问Lammps做复合材料拉伸模拟,应力应变曲线问题