I am trying to parse an XML sitemap index in Go and then loop over the addresses it lists to fetch the details of each post. But I am getting this weird error:
: first path segment in URL cannot contain colon
This is the code snippet:
package main

import (
	"encoding/xml"
	"fmt"
	"io/ioutil"
	"net/http"
)

// SitemapIndex mirrors the root element of the sitemap index.
type SitemapIndex struct {
	Locations []Location `xml:"sitemap"`
}

// Location holds the text of one <sitemap><loc> element.
type Location struct {
	Loc string `xml:"loc"`
}

func (l Location) String() string {
	return l.Loc
}

func main() {
	// Errors are ignored here to keep the snippet short.
	resp, _ := http.Get("https://www.washingtonpost.com/news-sitemaps/index.xml")
	bytes, _ := ioutil.ReadAll(resp.Body)
	resp.Body.Close()
	var s SitemapIndex
	xml.Unmarshal(bytes, &s)
	for _, Location := range s.Locations {
		fmt.Printf("Location: %s", Location.Loc)
		resp, err := http.Get(Location.Loc)
		fmt.Println("resp", resp)
		fmt.Println("err", err)
	}
}
And the output:
Location:
https://www.washingtonpost.com/news-sitemaps/politics.xml
resp <nil>
err parse
https://www.washingtonpost.com/news-sitemaps/politics.xml
: first path segment in URL cannot contain colon
Location:
https://www.washingtonpost.com/news-sitemaps/opinions.xml
resp <nil>
err parse
https://www.washingtonpost.com/news-sitemaps/opinions.xml
: first path segment in URL cannot contain colon
...
...
My guess is that Location.Loc contains a newline before and after the actual address.
E.g.:
Location: https://www.washingtonpost.com/news-sitemaps/politics.xml
Because hardcoding the URL works as expected:
for _, Location := range s.Locations {
	fmt.Printf("Location: %s", Location.Loc)
	test := "https://www.washingtonpost.com/news-sitemaps/politics.xml"
	resp, err := http.Get(test)
	fmt.Println("resp", resp)
	fmt.Println("err", err)
}
Output (as you can see, the error is nil):
Location:
https://www.washingtonpost.com/news-sitemaps/politics.xml
resp &{200 OK 200 HTTP/2.0 2 0 map[Server:[nginx] Arc-Service:[api] Arc-Org-Name:[washpost] Expires:[Sat, 02 Feb 2019 05:32:38 GMT] Content-Security-Policy:[upgrade-insecure-requests] Arc-Deployment:[washpost] Arc-Organization:[washpost] Cache-Control:[private, max-age=60] Arc-Context:[index] Arc-Application:[Feeds] Vary:[Accept-Encoding] Content-Type:[text/xml; charset=utf-8] Arc-Servername:[api.washpost.arcpublishing.com] Arc-Environment:[index] Arc-Org-Env:[washpost] Arc-Route:[/feeds] Date:[Sat, 02 Feb 2019 05:31:38 GMT]] 0xc000112870 -1 [] false true map[] 0xc00017c200 0xc0000ca370}
err <nil>
Location:
...
...
But I am very new to Go, and so I have no idea what's wrong. Could you please tell me where I am wrong?