dqnhfbc3738 2016-07-19 00:03

What is the most efficient and scalable way to use curl in Golang to grep for a string on a website?

Background

user@host$ curl -s http://stackoverflow.com | grep -m 1 stackoverflow.com

returns immediately if the string is found:

<meta name="twitter:domain" content="stackoverflow.com"/>

Aim

Find a string on a website using Golang.

Method

Based on sources from Go by Example and Schier's Blog, the following code was created:

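package main

import (
    "fmt"
    "io"
    "log"
    "net/http"
    "regexp"
)

func main() {
    url := "http://stackoverflow.com"
    resp, err := http.Get(url)
    if err != nil {
        log.Fatalf("failed to fetch %s: %v", url, err)
    }
    defer resp.Body.Close()

    // Reads the whole page into memory (io.ReadAll replaces the deprecated ioutil.ReadAll).
    body, err := io.ReadAll(resp.Body)
    if err != nil {
        log.Fatalf("failed to read body: %v", err)
    }

    // Escape the dot so it matches a literal "." rather than any character.
    r := regexp.MustCompile(`stackoverflow\.com`)
    fmt.Println(string(r.Find(body)))
}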

Results

Running the code results in:

stackoverflow.com

Discussion & Conclusions

  1. More code is required to achieve the same aim in Golang. Is there a shorter solution?
  2. Both options seem to return at the same time. Is static code in this case faster than dynamic code as well?
  3. I am concerned that this code consumes too much memory. It should eventually be used to monitor hundreds of different websites.

1 answer

  • dongqiangse6623 2016-07-19 04:19

    This code implements grep, stopping at the first line that contains the given string. It avoids reading the entire webpage into memory at once by using a bufio.Scanner, which bounds memory use and may also speed up the program when the string is found near the start of a huge page. It uses scan.Bytes() rather than converting every line into a string, which would cause significant memory churn.

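    package main

    import (
        "bufio"
        "bytes"
        "fmt"
        "log"
        "net/http"
    )

    func main() {
        resp, err := http.Get("http://stackoverflow.com")
        if err != nil {
            log.Fatalf("failed to open url: %v", err)
        }
        defer resp.Body.Close()

        scan := bufio.NewScanner(resp.Body)
        toFind := []byte("stackoverflow.com")
        for scan.Scan() {
            // scan.Bytes() avoids allocating a string for every line.
            if bytes.Contains(scan.Bytes(), toFind) {
                fmt.Println(scan.Text())
                return
            }
        }
        // Scan also stops on a read error or on a line longer than the
        // scanner's buffer (64 KiB by default), so report that case.
        if err := scan.Err(); err != nil {
            log.Fatalf("failed to scan response: %v", err)
        }
    }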
    
    Accepted by the asker as the best answer.
