dqnhfbc3738 2016-07-19 00:03
Accepted

What is the most efficient and scalable way to curl a website and grep it for a string in Golang?

Background

user@host curl -s http://stackoverflow.com | grep -m 1 stackoverflow.com

returns immediately if the string is found:

<meta name="twitter:domain" content="stackoverflow.com"/>

Aim

Find a string on a website using Golang.

Method

Based on sources from Go by Example and Schier's Blog, the following code was created:

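package main

import (
    "fmt"
    "io/ioutil"
    "log"
    "net/http"
    "regexp"
)

func main() {
    url := "http://stackoverflow.com"
    resp, err := http.Get(url)
    if err != nil {
        log.Fatalf("failed to fetch %s: %v", url, err)
    }
    defer resp.Body.Close()

    // Read the whole response body into memory.
    body, err := ioutil.ReadAll(resp.Body)
    if err != nil {
        log.Fatalf("failed to read body: %v", err)
    }

    // Escape the dot so that it matches a literal "." only.
    r := regexp.MustCompile(`stackoverflow\.com`)
    fmt.Println(r.FindString(string(body)))
}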

Results

Running the code results in:

stackoverflow.com

Discussion & Conclusions

  1. More code is required to achieve the same aim in Golang. Is there a shorter solution?
  2. Both options seem to return in about the same time. Is static code faster than dynamic code in this case as well?
  3. I am concerned that this code consumes too much memory. It should eventually be used to monitor hundreds of different websites.

1 answer

  • dongqiangse6623 2016-07-19 04:19

    This code implements grep, stopping at the first line that contains the given string. By using a bufio.Scanner it avoids reading the entire webpage into memory at once, which bounds memory use and may also speed up the program when the string is found near the start of a huge page. It compares against scan.Bytes() rather than scan.Text() to avoid converting every line into a string, which would cause significant memory churn; only the matching line is converted for printing.

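    package main

    import (
        "bufio"
        "bytes"
        "fmt"
        "log"
        "net/http"
    )

    func main() {
        resp, err := http.Get("http://stackoverflow.com")
        if err != nil {
            log.Fatalf("failed to open url: %v", err)
        }
        defer resp.Body.Close()

        scan := bufio.NewScanner(resp.Body)
        toFind := []byte("stackoverflow.com")
        for scan.Scan() {
            // Compare raw bytes; build a string only for the matching line.
            if bytes.Contains(scan.Bytes(), toFind) {
                fmt.Println(scan.Text())
                return
            }
        }
        // Scan returns false on EOF or on error; report the latter.
        if err := scan.Err(); err != nil {
            log.Fatalf("failed to read response: %v", err)
        }
    }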
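
One caveat with the scanner approach: by default a bufio.Scanner gives up with bufio.ErrTooLong once a single line exceeds 64 KiB (bufio.MaxScanTokenSize), which minified pages can do. The following is a minimal sketch, not part of the original answer, that wraps the same loop in a reusable function and raises that limit. The names grepFirst and grepFirstInString and the 1 MiB cap are assumptions chosen for illustration.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// maxLineSize is an assumed upper bound on a single line; tune it to your pages.
const maxLineSize = 1 << 20 // 1 MiB

// grepFirst returns the first line of r that contains needle.
// found is false if no line matches; err reports read failures,
// including bufio.ErrTooLong when a line exceeds maxLineSize.
func grepFirst(r io.Reader, needle []byte) (line string, found bool, err error) {
	scan := bufio.NewScanner(r)
	// Start with a 64 KiB buffer and let it grow up to maxLineSize.
	scan.Buffer(make([]byte, 0, 64*1024), maxLineSize)
	for scan.Scan() {
		// Compare raw bytes; only the matching line becomes a string.
		if bytes.Contains(scan.Bytes(), needle) {
			return scan.Text(), true, nil
		}
	}
	return "", false, scan.Err()
}

// grepFirstInString is a hypothetical convenience wrapper for in-memory input.
func grepFirstInString(page, needle string) (string, bool, error) {
	return grepFirst(strings.NewReader(page), []byte(needle))
}

func main() {
	page := "<html>\n<meta name=\"twitter:domain\" content=\"stackoverflow.com\"/>\n</html>\n"
	line, found, err := grepFirstInString(page, "stackoverflow.com")
	fmt.Println(line, found, err)
}
```

To use it against a live site, pass resp.Body from http.Get to grepFirst in place of the in-memory reader; memory use then stays bounded by maxLineSize per page rather than by page size.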
    
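The question's third concern, monitoring hundreds of sites, is not covered by the answer. A rough sketch of one way to approach it is to run the same streaming check concurrently while capping how many pages are in flight, so that total memory stays near limit times the per-page buffer. Everything named here (checkAll, checkOne, fetchFunc, httpFetch, fakeFetch, the 10 second timeout, the .example URLs) is an assumption for illustration. The demo uses in-memory pages so that it runs without network access.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"net/http"
	"strings"
	"sync"
	"time"
)

// result records the outcome of checking one URL.
type result struct {
	Found bool
	Err   error
}

// fetchFunc opens a page body; the caller closes it.
type fetchFunc func(url string) (io.ReadCloser, error)

// client is shared by all requests; the 10s timeout is an assumed value.
var client = &http.Client{Timeout: 10 * time.Second}

// httpFetch is the fetchFunc to use against real sites.
func httpFetch(url string) (io.ReadCloser, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		resp.Body.Close()
		return nil, fmt.Errorf("%s: unexpected status %s", url, resp.Status)
	}
	return resp.Body, nil
}

// fakeFetch serves pages from a map so the sketch runs without a network.
func fakeFetch(pages map[string]string) fetchFunc {
	return func(url string) (io.ReadCloser, error) {
		page, ok := pages[url]
		if !ok {
			return nil, fmt.Errorf("%s: no such page", url)
		}
		return io.NopCloser(strings.NewReader(page)), nil
	}
}

// checkOne streams one page line by line and stops at the first match.
func checkOne(url string, needle []byte, fetch fetchFunc) result {
	body, err := fetch(url)
	if err != nil {
		return result{Err: err}
	}
	defer body.Close()
	scan := bufio.NewScanner(body)
	scan.Buffer(make([]byte, 0, 64*1024), 1<<20) // allow lines up to 1 MiB
	for scan.Scan() {
		if bytes.Contains(scan.Bytes(), needle) {
			return result{Found: true}
		}
	}
	return result{Err: scan.Err()}
}

// checkAll checks every URL, running at most limit checks at a time.
func checkAll(urls []string, needle string, limit int, fetch fetchFunc) map[string]result {
	if limit < 1 {
		limit = 1
	}
	results := make(map[string]result, len(urls))
	var mu sync.Mutex
	var wg sync.WaitGroup
	sem := make(chan struct{}, limit) // counting semaphore
	for _, u := range urls {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit checks are already running
		go func(u string) {
			defer wg.Done()
			defer func() { <-sem }()
			r := checkOne(u, []byte(needle), fetch)
			mu.Lock()
			results[u] = r
			mu.Unlock()
		}(u)
	}
	wg.Wait()
	return results
}

func main() {
	pages := map[string]string{
		"http://a.example": "<html>\nstackoverflow.com\n</html>\n",
		"http://b.example": "<html>\nnothing here\n</html>\n",
	}
	urls := []string{"http://a.example", "http://b.example"}
	// Pass httpFetch instead of fakeFetch(pages) to check real sites.
	for u, r := range checkAll(urls, "stackoverflow.com", 2, fakeFetch(pages)) {
		fmt.Println(u, r.Found, r.Err)
	}
}
```

The fetch function is injected so the same loop can be exercised offline; whether a semaphore or a fixed worker pool suits a given monitor better depends on how the URL list is produced.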
