weixin_39781783
weixin_39781783
2020-12-28 06:08

SmartPtr.h: remove unnecessary and thread unsafe assignment.

This commit fixes sporadic crashes during registering or stacking caused by incorrect reference counting in CSmartPtr class.

I've seen these rare crashes when processing a large number of images (100+) on system which has 16 threads. The VisualStudio debugger showed that these crashes were caused by accessing a CMemoryBitmap where corresponding CSmartPtr::m_lRefCount is 0.

Googling "Deep Sky Stacker crash" also gives a few complains about sudden crashes during stacking,

It turned out that CSmartPtr::AddRef is not thread safe: m_lRefCount = InterlockedIncrement(&m_lRefCount); The m_lRefCount assignment is not necessary at all and introduces a race condition. If thread1 executes InterlockedIncrement and then another thread2 increments m_lRefCount before the thread1 executes the assignment, then the thread2 increment will be lost.

The following diff makes the bug 100% reproducible by increasing the race window:


diff --git a/Tools/SmartPtr.h b/Tools/SmartPtr.h
index 60a55ef..172566f 100644
--- a/Tools/SmartPtr.h
+++ b/Tools/SmartPtr.h
@@ -19,7 +19,9 @@ public :
        void    AddRef()
        {
                #if defined(_WINDOWS_)
-               m_lRefCount = InterlockedIncrement(&m_lRefCount);
+               LONG l = InterlockedIncrement(&m_lRefCount);
+               Sleep(1);
+               m_lRefCount = l;
                #else
                m_lRefCount++;
                #endif

Please consider merging the fix. Thanks.

该提问来源于开源项目:deepskystacker/DSS

  • 点赞
  • 写回答
  • 关注问题
  • 收藏
  • 复制链接分享
  • 邀请回答

8条回答

  • weixin_39774219 weixin_39774219 4月前

    Hi there (what's your name BTW). I'm pretty sure I understand why this is a problem and why you made the change. I'll ask Tony Cook for a second opinion.

    点赞 评论 复制链接分享
  • weixin_39781783 weixin_39781783 4月前

    Hi perdrix52. My name is Vitali.

    点赞 评论 复制链接分享
  • weixin_39746229 weixin_39746229 4月前

    I'm not understanding the code (example??) shown above as it does not match the actual submitted fixed file. Also the code shown above just repeats the original bug. Is it some test case that increases the frequency of failure to demonstrate the problem more reliably? See ...

    
              m_lRefCount = InterlockedIncrement(&m_lRefCount); 
    

    versus

    
              LONG l = InterlockedIncrement(&m_lRefCount);
              Sleep(1);                                              ???? Whats that for? 
              m_lRefCount = l;                                       ???? Still thread unsafe
    

    However the committed file does have the CORRECT fix. C.f. the actual submitted change is (see https://github.com/LucCoiffier/DSS/pull/44/files#diff-0 )

    
    @@ -19,7 +19,7 @@ public :
        void    AddRef()
        {
            #if defined(_WINDOWS_)
        -   m_lRefCount = InterlockedIncrement(&m_lRefCount);
        +   InterlockedIncrement(&m_lRefCount);
            #else
            m_lRefCount++;
            #endif
    

    I.e. the redundant and wholly thread unsafe external assignment (m_lRefCount = InterlockedIncrement(&m_lRefCount);) has been removed. The assignment is thread unsafe because it lies outside the scope of the thread interlock provided by InterlockedIncrement. InterlockedIncrement internally increments m_lRefCount (a reference argument) within the thread interlock.

    I approve the actual submitted fix on code review alone.

    Tony Cook

    点赞 评论 复制链接分享
  • weixin_39781783 weixin_39781783 4月前

    tonk777, I was probably unclear about the code example from my comment. I said: "The following diff makes the bug 100% reproducible by increasing the race window:"

    Of course the code example (the diff from the comment) is not a fix. Actually it is a way to reproduce the bug more reliable. Just in case if someone would like to reproduce the problem to see for yourself.

    点赞 评论 复制链接分享
  • weixin_39746229 weixin_39746229 4月前

    OK Vitali - now fully understood. That was my interpretation in the end :). Your real fix looks good to me.

    点赞 评论 复制链接分享
  • weixin_39774219 weixin_39774219 4月前

    Exactly – it increases the failure frequency to close to 100%

    Dave

    From: tonk777 [mailto:notifications.com] Sent: 24 October 2018 18:00 To: LucCoiffier/DSS Cc: perdrix52; Comment Subject: Re: [LucCoiffier/DSS] SmartPtr.h: remove unnecessary and thread unsafe assignment. (#44)

    I'm not understanding the code (example??) shown above as it does not match the actual submitted fixed file. Also the code shown above just repeats the original bug. Is it some test case that increases the frequency of failure to demonstrate the problem more reliably? See ...

      m_lRefCount = InterlockedIncrement(&m_lRefCount);
    

    versus

      LONG l = InterlockedIncrement(&m_lRefCount);
      Sleep(1);                                                                        ???? Whats that for? 
      m_lRefCount = l;                                                           ???? Still thread unsafe
    

    However the committed file does have the CORRECT fix. C.f. the actual submitted change is:

    @@ -19,7 +19,7 @@ public : void AddRef() {

    if defined(WINDOWS)

    • m_lRefCount = InterlockedIncrement(&m_lRefCount);
    • InterlockedIncrement(&m_lRefCount);

    else

    m_lRefCount++;

    endif

    I.e. the redundant and wholly thread unsafe external assignment (m_lRefCount = InterlockedIncrement(&m_lRefCount);) has been removed. The assignment is thread unsafe because it lies outside the scope of the thread interlock provided by InterlockedIncrement. InterlockedIncrement internally increments m_lRefCount (a reference argument) within the thread interlock.

    I approve the actual submitted fix on code review alone.

    — You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/LucCoiffier/DSS/pull/44#issuecomment-432741710 , or mute the thread https://github.com/notifications/unsubscribe-auth/ATRoMmBx75NJCUTZWUaGAFLocDRXb1sIks5uoJyRgaJpZM4X3b7B . https://github.com/notifications/beacon/ATRoMsZrd7qVG3oyo_xpTtMrNduMucFbks5uoJyRgaJpZM4X3b7B.gif

    点赞 评论 复制链接分享
  • weixin_39774219 weixin_39774219 4月前

    OK following discussion I have merged the update.

    Thank you very much Vitaly, and to Tony for confirming that the fix is correct.

    点赞 评论 复制链接分享
  • weixin_39774219 weixin_39774219 4月前

    Please can you provide some information about yourself (and perhaps update your GitHub profile which currently shows nothing about you). I'd like to invite you to join our "Slack" discussion group.

    点赞 评论 复制链接分享

相关推荐