I am trying to process lines from a file concurrently, but for some reason I appear to be getting inconsistent results. A simplified version of my code is below:

    var wg sync.WaitGroup
    semaphore := make(chan struct{}, 2)
    lengths := []int{}
    for _, file := range args[1:] {
        // Open the file and start reading it
        reader, err := os.Open(file)
        if err != nil {
            fmt.Println("Problem reading input file:", file)
            fmt.Println("Error:", err)
            os.Exit(0)
        }
        scanner := bufio.NewScanner(reader)
        // Start streaming lines
        for scanner.Scan() {
            wg.Add(1)
            text := scanner.Text()
            semaphore <- struct{}{}
            go func(line string) {
                length := getInformation(line)
                lengths = append(lengths, length)
                <-semaphore
                wg.Done()
            }(text)
        }
    }
    wg.Wait()
    sort.Ints(lengths)
    fmt.Println("Lengths:", lengths)

The `getInformation` function just returns the length of the line, and I then append that length to the `lengths` slice. The issue I'm having is that when I run this multiple times against the same file, I get a different number of items in the slice. I had assumed that, since I am using a `sync.WaitGroup`, every line would be processed on each run and the contents of `lengths` would therefore always be the same, but this does not appear to be the case. Can anyone see what I am doing wrong here?
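
For reference, here is a minimal stand-in for `getInformation` so the snippet above can be reproduced. It is a sketch rather than my exact code: it assumes the length is counted in bytes with `len`, not in runes.

```go
package main

import "fmt"

// getInformation is a stand-in for the real function. It returns the
// length of the line in bytes (assumption: len, not a rune count).
func getInformation(line string) int {
	return len(line)
}

func main() {
	fmt.Println(getInformation("hello")) // prints 5
}
```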