There are lots of resources online about concurrency patterns in Go; those three I got from a quick Google search. But if you have something specific in mind, I think I can address that too.
It looks like you want to crawl a website and get information from its many pages concurrently, depositing that "information" into a common location (i.e. a slice). The way to go here is to use a chan, or channel, which is a thread-safe data structure (multiple goroutines can access it without fear) for channeling data from one thread/goroutine to another.
And of course, the go keyword is how you spawn a goroutine in Go.
So, for example, in the func main() thread:
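// get a listOfWebpages
dataChannel := make(chan string)
for _, webpage := range listOfWebpages {
    go fetchDataFromWebpage(webpage, dataChannel)
}
// the dataChannel will be concurrently filled with the data you send to it;
// receive exactly one result per page, because ranging over a channel that
// is never closed would block forever once the last result has arrived
for range listOfWebpages {
    x := <-dataChannel
    fmt.Println(x) // print the header or whatever you scraped from webpage
}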
The goroutines will be functions that scrape websites and feed the dataChannel (you mentioned you know how to scrape websites already). Something like this:
func fetchDataFromWebpage(url string, c chan string) {
    data := scrapeWebsite(url)
    c <- data // send the data to the thread-safe channel
}
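Putting the two pieces together, here is a minimal runnable sketch of the whole thing. A few things in it are my own assumptions rather than anything from your code: scrapeWebsite is a hypothetical stub that returns a fake title (swap in your real scraper), and the crawl helper and the example.com URLs are just for illustration. One detail worth knowing: a range over a channel only stops once the channel is closed, so this version uses a sync.WaitGroup to close the channel after every goroutine has sent its result.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// scrapeWebsite is a hypothetical stand-in for real scraping code; it
// returns a fake title so the example runs without network access.
func scrapeWebsite(url string) string {
	return "title of " + url
}

// fetchDataFromWebpage scrapes one page, sends the result on c, and
// signals the WaitGroup when it is done.
func fetchDataFromWebpage(url string, c chan<- string, wg *sync.WaitGroup) {
	defer wg.Done()
	c <- scrapeWebsite(url)
}

// crawl (an illustrative helper) fetches every page concurrently and
// returns the collected results in sorted order.
func crawl(urls []string) []string {
	dataChannel := make(chan string)
	var wg sync.WaitGroup

	for _, url := range urls {
		wg.Add(1)
		go fetchDataFromWebpage(url, dataChannel, &wg)
	}

	// Close the channel once every sender has finished, so the range
	// loop below terminates instead of blocking forever.
	go func() {
		wg.Wait()
		close(dataChannel)
	}()

	var results []string
	for data := range dataChannel {
		results = append(results, data)
	}
	sort.Strings(results) // arrival order is not deterministic
	return results
}

func main() {
	urls := []string{"https://example.com/a", "https://example.com/b"}
	for _, r := range crawl(urls) {
		fmt.Println(r)
	}
}
```

Only the receiving loop appends to the results slice, so the slice itself needs no lock; the channel is the only thing the goroutines share.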
If you're having trouble understanding how to use concurrency tools such as channels, mutex locks, or WaitGroups, maybe you should start by trying to understand why concurrency can be problematic :) To me, the best illustration of that is the Dining Philosophers Problem: https://en.wikipedia.org/wiki/Dining_philosophers_problem
Five silent philosophers sit at a round table with bowls of spaghetti. Forks are placed between each pair of adjacent philosophers.
Each philosopher must alternately think and eat. However, a philosopher can only eat spaghetti when they have both left and right forks. Each fork can be held by only one philosopher and so a philosopher can use the fork only if it is not being used by another philosopher. After an individual philosopher finishes eating, they need to put down both forks so that the forks become available to others. A philosopher can take the fork on their right or the one on their left as they become available, but cannot start eating before getting both forks.
If practice is what you're looking for, I recommend implementing this problem so that it fails, and then trying to fix it using concurrency patterns :) There are other problems like this available too! And creating the problem is one step towards understanding how to solve it!
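If you get stuck on the fixing part, here is a rough sketch of one possible fix (there are several), assuming each fork is modelled as a sync.Mutex. The dine function and its parameters are names I made up for illustration. The naive version, where everyone grabs their left fork first, can deadlock when all five hold one fork at once; this sketch avoids that by having every philosopher pick up the lower-numbered fork first.

```go
package main

import (
	"fmt"
	"sync"
)

// dine seats n philosophers (n must be at least 2), lets each of them
// eat `meals` times, and returns how many meals each one finished.
func dine(n, meals int) []int {
	forks := make([]sync.Mutex, n) // fork i sits between philosophers i-1 and i
	eaten := make([]int, n)
	var wg sync.WaitGroup

	for p := 0; p < n; p++ {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			first, second := p, (p+1)%n
			// Always lock the lower-numbered fork first. With a single
			// global order on the forks, a cycle of philosophers each
			// waiting on the next one cannot form.
			if first > second {
				first, second = second, first
			}
			for i := 0; i < meals; i++ {
				forks[first].Lock()
				forks[second].Lock()
				eaten[p]++ // eating; only goroutine p writes eaten[p]
				forks[second].Unlock()
				forks[first].Unlock()
			}
		}(p)
	}

	wg.Wait()
	return eaten
}

func main() {
	fmt.Println(dine(5, 3))
}
```

To see the failure first, remove the swap so that every philosopher locks fork p before fork (p+1)%n, and add a short sleep between the two Lock calls to make the deadlock much more likely.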
If you're having more trouble just understanding how to use channels, then aside from reading up on them, you can more simply think of channels as queues that can safely be accessed/modified from concurrent goroutines.
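To make the queue picture concrete, here is a small sketch using a buffered channel, which is the kind that behaves most like a queue. The produce and drain helpers are hypothetical names of mine. One goroutine pushes values in and the main goroutine pops them out in the same first-in, first-out order.

```go
package main

import "fmt"

// produce starts a goroutine that enqueues 1..n on a buffered channel
// and closes the channel when it is done.
func produce(n int) <-chan int {
	q := make(chan int, n) // buffer of n: sends do not block until it is full
	go func() {
		defer close(q) // closing tells the receiver no more values are coming
		for i := 1; i <= n; i++ {
			q <- i // enqueue
		}
	}()
	return q
}

// drain dequeues values until the channel is closed and empty.
func drain(q <-chan int) []int {
	var out []int
	for v := range q {
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(drain(produce(3))) // prints [1 2 3]
}
```

The analogy is loosest for an unbuffered channel (made without a size), where each send waits until a receiver is ready to take the value, so it works more like a hand-off than a queue.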