This is a stripped-down version of the code I want to use for a page-specific web crawler. The idea is to have a function that fetches a URL, handles the HTTP side of things, and returns an io.Reader for the response body (http.Response.Body):
package main

import (
    "io"
    "log"
    "net/http"
    "os"
)

func main() {
    const url = "https://xkcd.com/"

    r, err := getPageContent(url)
    if err != nil {
        log.Fatal(err)
    }

    f, err := os.Create("out.html")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    io.Copy(f, r)
}

func getPageContent(url string) (io.Reader, error) {
    res, err := http.Get(url)
    if err != nil {
        return nil, err
    }
    return res.Body, nil
}
The response body is never closed, which is bad. Closing it inside the getPageContent function won't work, of course, since io.Copy wouldn't be able to read anything from an already closed resource.
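To make the failure mode concrete, this is the variant I mean (a drop-in replacement for the getPageContent above, same imports). It is a sketch of the broken approach, not something I intend to keep: the deferred Close fires when the function returns, so any later read by the caller fails.

// Broken variant: the deferred Close runs as soon as getPageContent returns,
// so the caller receives a reader whose underlying body is already closed
// and io.Copy in main can no longer read from it.
func getPageContent(url string) (io.Reader, error) {
    res, err := http.Get(url)
    if err != nil {
        return nil, err
    }
    defer res.Body.Close() // closes too early: before the caller reads
    return res.Body, nil
}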
My question is of general interest rather than about this specific use case: How can I use functions to abstract the gathering of external resources without having to store the whole resource in a temporary buffer? Or should I avoid such abstractions altogether?
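For context, the two shapes I can think of so far are sketched below (both are rough sketches, and withPageContent is just a hypothetical name): either return an io.ReadCloser so the caller owns the close, or accept a callback so the function itself can close the body after the callback has consumed it.

// Option 1: return io.ReadCloser and make the caller responsible for closing.
func getPageContent(url string) (io.ReadCloser, error) {
    res, err := http.Get(url)
    if err != nil {
        return nil, err
    }
    return res.Body, nil
}

// Option 2 (hypothetical helper): take a callback, so the body can be
// closed here once the callback has finished reading it.
func withPageContent(url string, fn func(io.Reader) error) error {
    res, err := http.Get(url)
    if err != nil {
        return err
    }
    defer res.Body.Close()
    return fn(res.Body)
}

With option 2 the main function would look something like:

err := withPageContent(url, func(r io.Reader) error {
    _, err := io.Copy(f, r)
    return err
})

Neither feels obviously right to me, which is why I'm asking.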