dongliao1949 2018-02-10 17:53
浏览 31
已采纳

具有交换功能的Golang无锁阵列

I have a two slot array that need to swap between slots when producer set it and always return a valid slot to consumer. As for side of atomic operation logic I can't imagine situation when two goroutine write to the same array slot, but race detector think otherwise. Does anyone can explain me, where is the bug?

type checkConfig struct {
    timeout   time.Time
}

type checkConfigVersions struct {
    config [2]*checkConfig
    reader uint32
    writer uint32
}

func (c *checkConfigVersions) get() *checkConfig {
    return c.config[atomic.LoadUint32(&c.reader)]
}

func (c *checkConfigVersions) set(new *checkConfig) {
    for {
        reader := atomic.LoadUint32(&c.reader)
        writer := atomic.LoadUint32(&c.writer)
        switch diff := reader ^ writer; {
        case diff == 0:
            runtime.Gosched()
        case diff == 1:
            if atomic.CompareAndSwapUint32(&c.writer, writer, (writer+1)&1) {
                c.config[writer] = new
                atomic.StoreUint32(&c.reader, writer)
                return
            }
        }
    }
}

Data race happened on c.config[writer] = new, but for my point of view it's not possible.

fun main() {
    runtime.GOMAXPROCS(runtime.NumCPU())

    var wg sync.WaitGroup

    ccv := &checkConfigVersions{reader: 0, writer: 1}

    for i := 0; i < runtime.NumCPU(); i++ {
        wg.Add(100)
        go func(i int) {
            for j := 0; j < 100; j++ {
                ccv.set(&checkConfig{})
                wg.Done()
            }
        }(i)
    }

    wg.Wait()
    fmt.Println(ccv.get())
}

Data race detector output:

==================
WARNING: DATA RACE
Write at 0x00c42009a020 by goroutine 12:
  main.(*checkConfigVersions).set()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:118 +0xd9
  main.main.func1()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:42 +0x60

Previous write at 0x00c42009a020 by goroutine 11:
  main.(*checkConfigVersions).set()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:118 +0xd9
  main.main.func1()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:42 +0x60

Goroutine 12 (running) created at:
  main.main()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:40 +0x159

Goroutine 11 (running) created at:
  main.main()
      /Users/apple/Documents/Cyber/Go/proxy/main.go:40 +0x159
==================

And if you try to read it with ccv.read(), you catch the another race but between read and write on the same array slot...

  • 写回答

1条回答 默认 最新

  • dtxq82489 2018-02-10 19:59
    关注

    I can change your code to check the just-written c.writer:

    if atomic.CompareAndSwapUint32(&c.writer, writer, (writer+1)&1) {
      newWriter := atomic.LoadUint32(&c.writer)
      if newWriter != (writer+1)&1 {
        panic(fmt.Errorf("wrote %d, but writer is %d", (writer+1)&1, newWriter))
      }
      //c.config[writer] = new
      atomic.StoreUint32(&c.reader, writer)
      return
    }
    

    Manually setting GOMAXPROCS (and the number of goroutines) to just 3, I get the following after running it just a few times:

    $ go run main.go  
    panic: wrote 1, but writer is 0
    
    goroutine 6 [running]:
    main.(*checkConfigVersions).set(0xc42000a060, 0x1, 0xc4200367a0)
        main.go:36 +0x16d
    main.main.func1(0xc42000a060, 0xc420018110, 0x1)
        main.go:58 +0x63
    created by main.main
        main.go:56 +0xd6
    exit status 2
    

    The main reason is that GOMAXPROCS sets the number of OS threads to use. These are not controlled by Go (Go just schedules goroutines onto OS threads). Instead, the OS gets to schedule OS threads however it wants.

    This means that one of the goroutines will CAS(1, 1, 0), then be suspended by the OS. During that time, another goroutine will go through and CAS(0, 0, 1). This allows the third goroutine to perform CAS(1, 1, 0) and continues at the same time as the original goroutine is scheduled for execution again. Bam! Both of them are trying to write to the same config[writer].

    Atomicity is great when you want to avoid a mutex, but should not be used to perform synchronization in this type of case.

    I'm not quite sure what your original scenario is, but I'm fairly confident going with a mutex will save you a lot of pain.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 关于#java#的问题,请各位专家解答!
  • ¥15 急matlab编程仿真二阶震荡系统
  • ¥20 TEC-9的数据通路实验
  • ¥15 ue5 .3之前好好的现在只要是激活关卡就会崩溃
  • ¥50 MATLAB实现圆柱体容器内球形颗粒堆积
  • ¥15 python如何将动态的多个子列表,拼接后进行集合的交集
  • ¥20 vitis-ai量化基于pytorch框架下的yolov5模型
  • ¥15 如何实现H5在QQ平台上的二次分享卡片效果?
  • ¥30 求解达问题(有红包)
  • ¥15 请解包一个pak文件