dongpan1416 2018-04-21 09:38

Parsing XML in Go after fetching it with io/ioutil and net/http

Any idea why this parsing does not work when I fetch the XML directly from the site, yet works when I copy and paste the same XML into a variable?

package main

import (
  "encoding/xml"
  "fmt"
  "strings"
  "io/ioutil"
  "net/http"
)

type Sitemapindex struct {
  Locations []Location `xml:"channel>item"`
}

type Location struct {
  Loc string `xml:"title"`
}

func (e Location) String() string {
  // Return the value directly; passing it to Sprintf as a format string would mangle any "%".
  return e.Loc
}

func main() {
  resp, _ := http.Get("https://www.sec.gov/Archives/edgar/xbrlrss.all.xml")
  bytes, _ := ioutil.ReadAll(resp.Body)  
  string_body := string(bytes)  
  var s Sitemapindex
  decoder := xml.NewDecoder(strings.NewReader(string_body))
  decoder.Strict = false
  decoder.Decode(&s)
  fmt.Println(s)
}

1 answer

  • duanbenzan4050 2018-04-21 09:49

    The feed you're parsing declares its encoding as windows-1252, while encoding/xml only decodes UTF-8 on its own. To read this data, the XML decoder needs a CharsetReader that can convert the declared charset.

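    import (
        "encoding/xml"
        "golang.org/x/net/html/charset"
    )

    // reader is the io.Reader that holds the XML, for example resp.Body.
    decoder := xml.NewDecoder(reader)
    // Let the decoder convert the declared charset (here windows-1252) to UTF-8.
    decoder.CharsetReader = charset.NewReaderLabel
    // Check err rather than discarding it.
    err := decoder.Decode(&s)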
    

    The error returned by decoder.Decode, which your code discards, most likely reports the same thing, so check it instead of ignoring it.
