duanpang2751
2016-08-31 05:42

Umlauts on an ISO-8859-1 encoded website

  • encoding
  • http
Accepted

My very simple code snippet:

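package main

import (
  "io"
  "net/http"
  "os"
)

func main() {
  resp, err := http.Get("http://example.com")
  if err == nil {
    defer resp.Body.Close()
    // Copies the raw response bytes to stdout without any charset conversion.
    io.Copy(os.Stdout, resp.Body)
  }
}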

When example.com is served with charset=iso-8859-1, my output is garbled. Umlauts, for example, are not displayed correctly:

Hällo Wörld --> H?llo W?rld

What's a good way to display umlauts correctly?


1 answer

  • dongxun1244, 5 years ago

    You can use the package golang.org/x/net/html/charset to determine the encoding of the website and to create a reader that converts the content to UTF-8.

    Below is a working example:

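    package main
    
    import (
        "io"
        "log"
        "net/http"
        "os"
    
        "golang.org/x/net/html/charset"
    )
    
    func main() {
        resp, err := http.Get("http://example.com")
        if err != nil {
            log.Fatal(err)
        }
        defer resp.Body.Close()
    
        // Wrap the body in a reader that decodes it to UTF-8, based on the
        // charset named in the Content-Type header.
        r, err := charset.NewReader(resp.Body, resp.Header.Get("Content-Type"))
        if err != nil {
            log.Fatal(err)
        }
    
        if _, err := io.Copy(os.Stdout, r); err != nil {
            log.Fatal(err)
        }
    }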
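
To make the conversion less of a black box, here is a minimal standard-library sketch of the two steps involved: reading the charset from a Content-Type value and decoding ISO-8859-1 bytes to UTF-8. It is not how the charset package is implemented. The helper names charsetFromContentType and latin1ToUTF8 are hypothetical, and the sketch assumes strict ISO-8859-1, where every byte value equals its Unicode code point. A real decoder may treat this label as windows-1252, which differs in the range 0x80 to 0x9F.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// charsetFromContentType (hypothetical helper) returns the lowercased
// charset parameter of a Content-Type value, or "" if none can be parsed.
func charsetFromContentType(contentType string) string {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return ""
	}
	return strings.ToLower(params["charset"])
}

// latin1ToUTF8 (hypothetical helper) decodes strict ISO-8859-1 bytes.
// Each byte value is used directly as a Unicode code point, and the
// strings.Builder encodes that code point as UTF-8.
func latin1ToUTF8(b []byte) string {
	var sb strings.Builder
	sb.Grow(len(b))
	for _, c := range b {
		sb.WriteRune(rune(c))
	}
	return sb.String()
}

func main() {
	// "Hällo Wörld" as ISO-8859-1 bytes: ä is 0xE4, ö is 0xF6.
	raw := []byte{'H', 0xE4, 'l', 'l', 'o', ' ', 'W', 0xF6, 'r', 'l', 'd'}

	if charsetFromContentType("text/html; charset=ISO-8859-1") == "iso-8859-1" {
		fmt.Println(latin1ToUTF8(raw))
	}
}
```

Written to a UTF-8 terminal unchanged, the single bytes 0xE4 and 0xF6 are not valid UTF-8 sequences, which is why the question's output shows "?" in place of the umlauts. For anything beyond this one encoding, the charset.NewReader approach above is the more practical choice.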
    
