dongyoufo5672 2017-10-14 13:02

Best way to read a file concurrently in Go

So I have a file like this:

NAME : a280
COMMENT : drilling problem (Ludwig)
TYPE : TSP
DIMENSION: 280
EDGE_WEIGHT_TYPE : EUC_2D
NODE_COORD_SECTION
  1 288 149
  2 288 129
  3 270 133
  4 256 141
  5 256 157
  6 246 157
  7 236 169
  8 228 169
  9 228 161
 10 220 169
 11 212 169
 12 204 169
 13 196 169
 14 188 169
 15 196 161

and so on...

The numbers are the coordinates of the cities for a TSP instance. I am trying to write a solver in Go. The instances can have 200 cities, or even 40,000. I want the best possible solution, so I thought I should process this file concurrently. I've got the following code:

package main

import (
    "bufio"
    "fmt"
    "os"
    "regexp"
    "strings"
)

func getCords(queue chan string) {
    cords := regexp.MustCompile(`\s*\d+\s+\d+\s+\d+`)
    for line := range queue {
        if cords.MatchString(line) {
            fmt.Println(line)
        }
    }
}

func readFile(fileName string) {
    cords := make(chan string)
    file := strings.NewReader(fileName)
    go func() {
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            cords <- scanner.Text()
        }
        close(cords)
    }()
}

// Menu - main program menu
func Menu() {
    reader := bufio.NewReader(os.Stdin)
    fmt.Println("==================    Projektowanie efektywnych algorytmów    ==================")
    fmt.Println("==================    Zadanie nr 1 - Algorytm xyz             ==================")
    // Load the data file
    // Format: No. X Y
    fmt.Printf("\nPodaj nazwę pliku: ")
    fileName, err := reader.ReadString('\n')
    if err != nil {
        fmt.Println(err)
        return
    }
    readFile(fileName)
}

func main() {
    Menu()
}

In the function getCords I need to use a regex, because the files tend to have an information section at the beginning.
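A minimal sketch of that filtering step, assuming integer coordinates as in the sample above. The `City` type, the `parseCity` helper, and the `^`/`$` anchors and capture groups are illustrative additions, not part of the original code:

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// City is a hypothetical type holding one parsed coordinate line.
type City struct {
	ID, X, Y int
}

// cordLine matches a whole line made of three whitespace-separated integers.
// The anchors reject lines that merely contain three numbers somewhere.
var cordLine = regexp.MustCompile(`^\s*(\d+)\s+(\d+)\s+(\d+)\s*$`)

// parseCity returns the parsed city and true for a coordinate line,
// or a zero City and false for anything else (for example a header line).
func parseCity(line string) (City, bool) {
	m := cordLine.FindStringSubmatch(line)
	if m == nil {
		return City{}, false
	}
	// The regex guarantees digits only, so Atoi can fail only on overflow.
	id, err1 := strconv.Atoi(m[1])
	x, err2 := strconv.Atoi(m[2])
	y, err3 := strconv.Atoi(m[3])
	if err1 != nil || err2 != nil || err3 != nil {
		return City{}, false
	}
	return City{ID: id, X: x, Y: y}, true
}

func main() {
	for _, line := range []string{"DIMENSION: 280", "  1 288 149", " 10 220 169"} {
		if c, ok := parseCity(line); ok {
			fmt.Printf("%+v\n", c)
		}
	}
}
```

Compiling the pattern once at package level avoids recompiling it in every goroutine that runs the parser.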

The problem starts in readFile(). I launch a goroutine that scans the file line by line and sends every line to the channel. Of course, execution just launches it and moves on. Now the problem is that after the go func() call I have to read from the channel. The solutions I found on SO and elsewhere on the internet looked like this:

func readFile(fileName string) {
    cords := make(chan string)
    file := strings.NewReader(fileName)
    go func() {
        scanner := bufio.NewScanner(file)
        for scanner.Scan() {
            cords <- scanner.Text()
        }
        close(cords)
    }()
    for i := 0; i < 100; i++ {
        go getCords(cords)
    }
}

So the first execution of getCords would probably do nothing, because the goroutine would not manage to put a line on the channel that fast. The next iterations would probably do the job, but the problem is that I have to pick some number, like 100 in this example. It can be too high, so the channel gets closed after about 10 iterations and everything after that is a waste of time, or it can be too low, and then I would not get all the results.

How do you solve problems like this, guys? Is there an optimal way, or do I have to stick with WaitGroups?

1 answer

  • doubinduo3364 2017-10-14 19:36

    Yes, I think it would be good to use sync.WaitGroup to make sure all goroutines have finished their work. One possible solution:

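    func getCords(queue chan string, wg *sync.WaitGroup) {
        // wg is a pointer: a WaitGroup passed by value is copied, so Done
        // would never reach the counter that readFile waits on.
        defer wg.Done()
        // your code
    }

    func readFile(fileName string) {
        cords := make(chan string)
        // Note: strings.NewReader scans the text of fileName itself;
        // open the file with os.Open to scan its contents instead.
        file := strings.NewReader(fileName)

        go func() {
            scanner := bufio.NewScanner(file)
            for scanner.Scan() {
                cords <- scanner.Text()
            }
            close(cords)
        }()

        var wg sync.WaitGroup
        for i := 0; i < 100; i++ {
            wg.Add(1)
            go getCords(cords, &wg)
        }
        wg.Wait() // returns once every getCords goroutine has called Done
    }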
    