dpj775835868 2018-07-02 17:13
浏览 118
已采纳

Golang四叉树的递归并发

I'm trying to parallelize my quadtree lookup by using go-routines to share the recursive lookup.

My Quadtree struct looks like this:

type Quadtree struct {
    Rectangle //Boundary of the quadtree
    Points   []Point
    Ne       *Quadtree
    Se       *Quadtree
    Sw       *Quadtree
    Nw       *Quadtree
    Divided  bool
    Capacity int
    Parent   *Quadtree
}

I have a method called QueryForNearestPoints which accepts a Point and a searchRadius and returns all the points within the searcharea in the Quadtree. If no points are found within the given searchRadius, then the searchRadius is increased and tried again.

points is the channel to which points in the search-radius are send to. I'm also using a sync.WaitGroup wg to wait for all go-routines to finish.

func (q *Quadtree) QueryForNearestPoints(p Point, searchRadius float64) []QueryResult {
    searcharea = Circle{p, searchRadius}
    var testResults []QueryResult

    for {

        go func() {
        loop:
            for {
                select {
                case item, _ := <-points:
                    testResults = append(testResults, item)
                case <-done:
                    break loop
                }
            }
        }()
        wg.Add(1)
        go q.query(searcharea)
        wg.Wait()
        done <- true
        if len(testResults) > 0 {
            break
        } else {
            // Proportionally increase the search radius if no points are found.
            searcharea = Circle{p, searcharea.Radius + searcharea.Radius*50/100}
        }
    }
    return testResults
}

The parallelized quadtree's query method:

func (q *Quadtree) query(area Circle) {
    defer wg.Done()
    if !q.overlapsCircle(area) {
        return
    }
    if len(q.Points) > 0 {
        for i, point := range q.Points {
            if area.contains(point) {
                points <- QueryResult{point, i, q}
            }
        }
    }
    if q.Divided {
        wg.Add(4)
        go q.Ne.parallelquery(area)
        go q.Se.parallelquery(area)
        go q.Sw.parallelquery(area)
        go q.Nw.parallelquery(area)
    }
}

(q *Quadtree) parallelquery method:

func (q *Quadtree) parallelquery(area Circle) {
    semaphore <- struct{}{}
    defer func() {
        <-semaphore
    }()
    q.query(area)
}

If I'm not wrong this is a cpu-bound workload so having 1000's of go-routines isn't going to help, so I'm using a buffered channel semaphore as a counting semaphore to make sure that at any point only n number of go-routines are active.(Here I made n as 4 because my computer has 4 cpus.)

The only problem is that this approach is significantly slower than the normal non-concurrent sequential approach. What am I missing here? How can i make this faster?

  • 写回答

1条回答 默认 最新

  • duanbinian2243 2018-07-03 07:29
    关注

    This is much slower because the work that you are doing in each of the goroutines is very cheap, and there are very many goroutines.

    For those that will criticise the terminology in this answer, I am going to use parallelism and concurrency interchangeably. Even though strictly, parallelism is a property of the runtime / computer, and concurrency is a property of the code. Some code can still be concurrent if it is running on hardware with a single processor, but it will not be running in parallel.

    The reason that the concurrent version here is slower is because there is still some synchronisation required when using channels etc. Under the hood, channels use a mutex to synchronise the sends / receives on a channel. This means that you are introducing contention between the goroutines.

    Each time you call parallelquery(area) you are starting 4 more goroutines that must be scheduled, but (using a semaphore channel) you've limited the number of concurrent goroutines to 4.

    After a n recursions you've got 4^n goroutines all trying to receive a value from the semaphore channel, this causes a lot of contention.

    Using a blocking CPU profiler on this you will probably find that most of the execution time is spent on contention attempting to get a token from the semaphore.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥88 实在没有想法,需要个思路
  • ¥15 MATLAB报错输入参数太多
  • ¥15 python中合并修改日期相同的CSV文件并按照修改日期的名字命名文件
  • ¥15 有赏,i卡绘世画不出
  • ¥15 如何用stata画出文献中常见的安慰剂检验图
  • ¥15 c语言链表结构体数据插入
  • ¥40 使用MATLAB解答线性代数问题
  • ¥15 COCOS的问题COCOS的问题
  • ¥15 FPGA-SRIO初始化失败
  • ¥15 MapReduce实现倒排索引失败