dth8312 2013-09-03 03:45
Accepted

Equivalent of Python's HTML parsing function/module in Go?

I'm teaching myself Go and am stuck on fetching and parsing HTML/XML. In Python, I usually write the following code when I do web scraping:

from urllib.request import urlopen, Request
url = "http://stackoverflow.com/"
req = Request(url)
html = urlopen(req).read()

After that, I have the raw HTML/XML as either a string or bytes and can work with it. How can I do the same in Go? What I want is the raw HTML data stored in either a string or a []byte (the two are easily converted, so I don't mind which I get). I'm considering the gokogiri package for web scraping in Go (though I'm not sure I'll end up using it), but it looks like it requires the raw HTML text before it can do any work...

So how can I acquire such an object?

Or is there a better way to do web scraping in Go?

Thanks.


1 answer

  • drvjlec1767 2013-09-03 03:50

    From the Go http.Get example:

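    package main
    
    import (
        "fmt"
        "io"
        "log"
        "net/http"
    )
    
    func main() {
        // Issue the GET request; res.Body streams the response body.
        res, err := http.Get("http://www.google.com/robots.txt")
        if err != nil {
            log.Fatal(err)
        }
        // Read the whole body into a []byte, then close it before checking the error.
        // io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16 and later).
        robots, err := io.ReadAll(res.Body)
        res.Body.Close()
        if err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%s", robots)
    }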
    

    This reads the contents of http://www.google.com/robots.txt into the variable robots, which is a []byte; convert it with string(robots) if you need a string.

    For XML parsing, look into Go's encoding/xml package.

    This answer was selected by the asker as the best answer.