duandiaoqian5795 2018-10-19 14:25

Converting UTF-8 to a single-byte encoding

I have a batch of incorrectly encoded records. This one-liner gives me the correct result:

cat example.txt | iconv -f utf-8 -t iso8859-2

But the following program gives me the error "encoding: rune not supported by encoding".

package main

import (
    "fmt"

    "golang.org/x/text/encoding/charmap"
)

func main() {
    // UTF-8 bytes of one wrongly encoded record.
    s := []byte{196, 144, 194, 154, 196, 144, 194, 176, 196, 144, 197, 186, 196, 144, 196, 190, 197, 131, 194, 128, 196, 144, 194, 176, 32, 52, 52, 53, 54, 50, 53, 54, 10, 10, 0, 0}
    fmt.Println(s)

    // Encode from UTF-8 to ISO 8859-2.
    enc := charmap.ISO8859_2.NewEncoder()
    out, err := enc.Bytes(s)
    if err != nil {
        fmt.Println(err)
        return
    }
    expectedOutput := "Камера 4456256"
    fmt.Println("result", string(out), "expect:", expectedOutput)
}

Can my problem be solved without iconv bindings?

1 answer

  • duanmu2013 2018-10-19 14:57

    Searching for charmap.ISO8859_2 suggests that you are using golang.org/x/text.

    This is how the transformation is done for a given Charmap:

    https://github.com/golang/text/blob/4d1c5fb19474adfe9562c9847ba425e7da817e81/encoding/charmap/charmap.go#L206

    The linked line is where the error comes from. So your input contains either characters that cannot be represented in ISO 8859-2 or invalid UTF-8.

    The encoder hands that error back to you unchanged, and the replacement value carried inside the RepertoireError seems to be a red herring.
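To find out which of the two causes applies (an unrepresentable character or invalid UTF-8), it may help to list what the input actually contains. The following is a minimal sketch using only the standard library; the helper name nonASCIIRunes is made up for this example and is not part of golang.org/x/text.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// nonASCIIRunes (hypothetical helper) returns the non-ASCII code points
// found in b, in order, plus the byte offsets where b is not valid UTF-8.
func nonASCIIRunes(b []byte) (runes []rune, invalid []int) {
	for i := 0; i < len(b); {
		r, size := utf8.DecodeRune(b[i:])
		switch {
		case r == utf8.RuneError && size == 1:
			// DecodeRune reports a malformed sequence this way.
			invalid = append(invalid, i)
		case r >= utf8.RuneSelf:
			runes = append(runes, r)
		}
		i += size
	}
	return runes, invalid
}

func main() {
	// Sample record from the question.
	s := []byte{196, 144, 194, 154, 196, 144, 194, 176, 196, 144, 197, 186,
		196, 144, 196, 190, 197, 131, 194, 128, 196, 144, 194, 176,
		32, 52, 52, 53, 54, 50, 53, 54, 10, 10, 0, 0}

	runes, invalid := nonASCIIRunes(s)
	fmt.Println("invalid UTF-8 at offsets:", invalid)
	for _, r := range runes {
		fmt.Printf("%U\n", r)
	}
}
```

Each code point it prints can then be compared with the ISO 8859-2 table to see whether the encoder should be able to represent it.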

    Of course you don't need iconv bindings. You can iterate through your input character by character, encode each one as ISO 8859-2, and decide for yourself what to do with unrepresentable characters.
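The character-by-character approach could look roughly like the sketch below, which uses only the standard library. Several things here are assumptions made for illustration: the function name encodeLatin2, the table latin2Partial (it holds only the entries the sample record needs, while a complete table has 96 entries for 0xA0 to 0xFF), the rule that code points below U+00A0 map to the byte of the same value, and the choice to substitute a caller-supplied byte for anything unrepresentable.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// latin2Partial maps runes to ISO 8859-2 bytes. Assumption: it covers only
// the characters that occur in the sample record, not the whole encoding.
var latin2Partial = map[rune]byte{
	0x00B0: 0xB0, // DEGREE SIGN
	0x0110: 0xD0, // LATIN CAPITAL LETTER D WITH STROKE
	0x013E: 0xB5, // LATIN SMALL LETTER L WITH CARON
	0x0143: 0xD1, // LATIN CAPITAL LETTER N WITH ACUTE
	0x017A: 0xBC, // LATIN SMALL LETTER Z WITH ACUTE
}

// encodeLatin2 (hypothetical helper) converts UTF-8 to ISO 8859-2 one rune
// at a time. Runes below U+00A0 (ASCII and the C0/C1 controls) are assumed
// to map to the byte with the same value. Invalid UTF-8 bytes and runes
// missing from the table are replaced by sub.
func encodeLatin2(src []byte, sub byte) []byte {
	out := make([]byte, 0, len(src))
	for i := 0; i < len(src); {
		r, size := utf8.DecodeRune(src[i:])
		i += size
		switch {
		case r == utf8.RuneError && size == 1:
			out = append(out, sub) // malformed UTF-8 byte
		case r < 0xA0:
			out = append(out, byte(r))
		default:
			if b, ok := latin2Partial[r]; ok {
				out = append(out, b)
			} else {
				out = append(out, sub) // not representable
			}
		}
	}
	return out
}

func main() {
	// Sample record from the question.
	s := []byte{196, 144, 194, 154, 196, 144, 194, 176, 196, 144, 197, 186,
		196, 144, 196, 190, 197, 131, 194, 128, 196, 144, 194, 176,
		32, 52, 52, 53, 54, 50, 53, 54, 10, 10, 0, 0}

	fmt.Printf("%q\n", encodeLatin2(s, '?'))
}
```

For the sample record, the resulting ISO 8859-2 bytes are themselves the UTF-8 encoding of the expected text "Камера 4456256", followed by the trailing newlines and NUL bytes. Returning an error instead of substituting a byte would be an equally reasonable policy; that is the decision left to the caller.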
