I'm looking for a simple way to get the text of a web page, preferably without having to resort to a bunch of regular expressions.
I thought I'd check first in case this kind of thing is already built in, or at least easier to do, in Go.
You could use goquery. This library works much like jQuery, letting you select elements and extract their text from an HTML document.
This example is taken from its GitHub page:
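package main

import (
	"fmt"
	"log"

	"github.com/PuerkitoBio/goquery"
)

func ExampleScrape() {
	// NewDocument fetches the URL and parses the response as HTML.
	// Newer goquery releases deprecate it in favor of http.Get plus
	// goquery.NewDocumentFromReader.
	doc, err := goquery.NewDocument("http://metalsucks.net")
	if err != nil {
		log.Fatal(err)
	}

	// Select each review block with a CSS selector, then pull the text
	// out of the elements nested inside it.
	doc.Find(".reviews-wrap article .review-rhs").Each(func(i int, s *goquery.Selection) {
		band := s.Find("h3").Text()
		title := s.Find("i").Text()
		fmt.Printf("Review %d: %s - %s\n", i, band, title)
	})
}

func main() {
	ExampleScrape()
}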