weixin_39548776
2021-01-10 14:57

threshold_otsu and threshold_multiotsu give different results

Description

Thank you very much for implementing threshold_multiotsu; it is super helpful for my project. However, I found an inconsistency: threshold_otsu and threshold_multiotsu with two classes produce different results.

Way to reproduce

```python
import numpy as np
from skimage.filters import threshold_otsu, threshold_multiotsu

array = np.array([-730.,  371.,  485.,  514.,  640.,  687.,  699.,  825.,  828., 1025., 1082., 1129., 1233.])
thresh = threshold_multiotsu(array, 2)
print(f"Multiotsu: {thresh[0]}")
# 830.43164062

thresh = threshold_otsu(array)
print(f"Singleotsu: {thresh}")
# -726.166015625
```

Version information

```python
# Paste the output of the following python commands
from __future__ import print_function
import sys; print(sys.version)
import platform; print(platform.platform())
import skimage; print("scikit-image version: {}".format(skimage.__version__))
import numpy; print("numpy version: {}".format(numpy.__version__))
```

```
3.6.3 (v3.6.3:2c5fed8, Oct  3 2017, 18:11:49) [MSC v.1900 64 bit (AMD64)]
Windows-10-10.0.17134-SP0
scikit-image version: 0.16.dev0
numpy version: 1.16.4
```

This question comes from the open-source project scikit-image/scikit-image.

16 answers

  • weixin_39605326 2021-01-10 14:57

    Hi, thank you very much for your bug report. I am trying to identify what could cause this effect, and I'll get back to you later, OK?

  • weixin_39927508 2021-01-10 14:57

    With our images, I found two that replicate the issue

    
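    ```python
    from skimage import data
    from skimage.filters import threshold_otsu, threshold_multiotsu

    names = ('camera', 'moon', 'coins', 'text', 'clock', 'page')

    # Print every sample image on which the two functions disagree.
    for name in names:
        img = getattr(data, name)()
        otsu = threshold_otsu(img)
        multi = threshold_multiotsu(img, 2)[0]
        if otsu != multi:
            print(name, otsu, multi)
    ```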
    

    moon 87 89

    
    ```python
    from skimage.color import rgb2gray

    names = ('chelsea', 'coffee', 'astronaut', 'rocket')

    # Same comparison on the colour sample images, converted to grayscale.
    for name in names:
        img = rgb2gray(getattr(data, name)())
        otsu = threshold_otsu(img)
        multi = threshold_multiotsu(img, 2)[0]
        if otsu != multi:
            print(name, otsu, multi)
    ```
    

    astronaut 0.388671875 0.439453125

  • weixin_39859954 2021-01-10 14:57

    When developing speedups for Otsu, I found that there often exist two points that give the same result mathematically. Which one is chosen is implementation-specific. I had to special-case a few of the points in the test. I think I left a link to a gist in the Otsu test that made things more explicit.
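
One way such ties can arise is a run of empty histogram bins: moving the split across empty bins changes neither class, so the score stays the same. The sketch below is a plain-Python illustration, not scikit-image's code. The function name `between_class_variance` and the toy histogram are made up for the example.

```python
def between_class_variance(hist):
    """Score every split k (class 0 = bins 0..k, class 1 = bins k+1..).

    The score w0 * w1 * (m0 - m1)**2 is proportional to Otsu's
    between-class variance; hist holds integer counts per bin.
    """
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    scores = []
    w0 = 0  # number of samples in class 0
    s0 = 0  # sum of bin indices over class 0
    for k in range(len(hist) - 1):
        w0 += hist[k]
        s0 += k * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            scores.append(0.0)  # one class is empty: no separation
            continue
        m0 = s0 / w0
        m1 = (total_sum - s0) / w1
        scores.append(w0 * w1 * (m0 - m1) ** 2)
    return scores


# Hypothetical toy histogram with a gap of empty bins in the middle.
hist = [2, 1, 0, 0, 0, 1, 2]
scores = between_class_variance(hist)
best = max(scores)

# Two equally valid tie-breaking rules: first maximum or last maximum.
first_best = scores.index(best)
last_best = len(scores) - 1 - scores[::-1].index(best)
print(first_best, last_best)  # prints: 1 4
```

Splits 1 through 4 all reach the same maximum here, so an implementation that keeps the first maximum and one that keeps the last would report different thresholds for the same data.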

  • weixin_39927508 2021-01-10 14:57

    All images pass if I call skimage.util.invert() prior to thresholding, which seems to confirm your observation.

  • weixin_39927508 2021-01-10 14:57

    Actually, it could be a nice note to say that the uniqueness of the threshold (likely also true for some other algorithms) can be checked by inverting the image.
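
One possible reading of this check, sketched in plain Python on a histogram rather than on an image: inverting an image reverses its histogram, so a unique optimum should land on the mirrored split. The helper names `first_best_split` and `split_is_unique` are hypothetical, and the sketch assumes the implementation keeps the first maximum it finds.

```python
from fractions import Fraction


def first_best_split(hist):
    """Index k of the first split maximising the between-class score.

    Class 0 is bins 0..k and class 1 is bins k+1.. (integer counts).
    """
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_k, best_score = 0, Fraction(-1)
    w0 = s0 = 0
    for k in range(len(hist) - 1):
        w0 += hist[k]
        s0 += k * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty: not a valid split
        # Exact value of w0 * w1 * (m0 - m1)**2, kept as a fraction so
        # that ties are not hidden or created by floating-point rounding.
        score = Fraction((s0 * w1 - (total_sum - s0) * w0) ** 2, w0 * w1)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


def split_is_unique(hist):
    """Compare the split on hist with the mirrored split on reversed hist."""
    k = first_best_split(hist)
    k_reversed = first_best_split(hist[::-1])
    return k == len(hist) - 2 - k_reversed


print(split_is_unique([6, 2, 1, 3, 4]))        # True: a single optimum
print(split_is_unique([2, 1, 0, 0, 0, 1, 2]))  # False: several tied optima
```

With tied optima, the first maximum of the reversed histogram mirrors the last maximum of the original one, which is why the two answers stop matching.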

  • weixin_40008339 2021-01-10 14:57

    I proposed in #4167 to deprecate the current implementation of threshold_otsu and make it call threshold_multiotsu with classes=2 instead. I based my suggestion on the comment above, which suggests that both results are valid:

    > I found that there often exist two points that give the same result mathematically. Which one is chosen is implementation-specific. I had to special-case a few of the points in the test.

    and on the fact that the refactored threshold_multiotsu now performs similarly to threshold_otsu:

    ```python
    In [1]: from skimage.filters import threshold_multiotsu, threshold_otsu

    In [2]: from skimage.data import moon

    In [3]: img = moon()

    In [4]: %timeit threshold_otsu(img)
    1.21 ms ± 444 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    In [5]: %timeit threshold_multiotsu(img, 2)
    2.07 ms ± 3.44 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    ```
    
  • weixin_39531594 2021-01-10 14:57

    Unfortunately, backwards compatibility is very important in scikit-image, so changing the results given by Otsu, probably our most popular thresholding algorithm, would be a mess.

  • weixin_40008339 2021-01-10 14:57

    The backward-compatibility argument sounds strange to me: suppose that you come up with a much faster algorithm performing Otsu thresholding, providing valid results, but in some cases different from the current implementation, will you not consider it for replacement? 🙂 BTW, this means that the only way to resolve this issue is to make multi-Otsu reproduce the Otsu results, even though its own solution is valid. The other option is to consider this issue not a bug but an acceptable behavior.

  • weixin_39531594 2021-01-10 14:57

    > suppose that you come up with a much faster algorithm performing Otsu thresholding, providing valid results, but in some cases different from the current implementation, will you not consider it for replacement? 🙂

    Yes, we will consider it. We actually break backwards compatibility all the time. But it must provide significant value for the users. In this case, the only advantage is to us, in slightly reducing the maintenance burden. But it's a very minor advantage.

    > The other option is to consider this issue not a bug but an acceptable behavior.

    This is perfectly valid, since both are valid implementations. In a way we are giving users some extra options. =)

  • weixin_39605326 2021-01-10 14:57

    > The other option is to consider this issue not a bug but an acceptable behavior.

    Not a bug, a feature! 😅

    > suppose that you come up with a much faster algorithm performing Otsu thresholding, providing valid results, but in some cases different from the current implementation, will you not consider it for replacement?

    As the original writer of the algorithm, I don't trust it enough yet to substitute Otsu. I had it tested for some situations, but in the wild it's showing some unexpected behavior... I want to figure out how to have the optimal algorithm first, then we can think about bold stuff like that 😄

  • weixin_40008339 2021-01-10 14:57

    > I don't trust it enough yet to substitute Otsu. I had it tested for some situations, but in the wild it's showing some unexpected behavior...

    Here is the real issue! We don't really care about making multi-Otsu provide the same results as Otsu... Let's expose these unexpected behaviors and try to solve them in another issue 😉

  • weixin_39859954 2021-01-10 14:57

    Backward compatibility was broken in a PR long ago: https://github.com/scikit-image/scikit-image/pull/3504/files

    2 ms vs 1.2 ms is a big difference in my mind. You did achieve 60% similarity between the performance of the two algorithms. I think you could likely close the gap a little more by studying the code in PR #3504, though I'm not too sure.

    Note that the speedup is a factor of 4x because a division and the use of floating-point numbers in certain cases were removed.

  • weixin_39605326 2021-01-10 14:57

    > Here is the real issue! (...) Let's expose these unexpected behaviors and try to solve them in another issue 😉

    The behavior is already exposed in this issue and in the related one (#3975). I'm not hiding anything... I just don't know how to fix it yet 😓 😅

    > We don't really care about making multi-Otsu provide the same results as Otsu...

    As a matter of fact, I really care that the Otsu algorithms agree with each other to a reasonable precision. They can perform differently, but the end result should be accurate.

  • weixin_40008339 2021-01-10 14:57

    My last comment was absolutely not criticism, but enthusiasm 😄 I hope you did not take it the wrong way. I am diving into the references to better understand the algorithm. I will do my best ;-)

  • weixin_39605326 2021-01-10 14:57

    Don't worry, I'm just making it clear that my point is to keep both versions around. Then we can use Otsu to compare our results until MultiOtsu is up and running for every case :)

  • weixin_40008339 2021-01-10 14:57

    I refactored threshold_multiotsu in #4178 and it now matches the results of threshold_otsu on the skimage data! The implementation is now faster than threshold_otsu:

    ```python
    In [1]: from skimage.filters import threshold_multiotsu, threshold_otsu

    In [2]: from skimage.data import moon

    In [3]: img = moon()

    In [4]: %timeit threshold_otsu(img)
    1.21 ms ± 385 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    In [5]: %timeit threshold_multiotsu(img, 2)
    905 µs ± 2.06 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    In [6]: %timeit threshold_multiotsu(img, 3)
    1.8 ms ± 856 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    In [7]: %timeit threshold_multiotsu(img, 4)
    83.4 ms ± 22.7 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

    In [8]: %timeit threshold_multiotsu(img, 5)
    5.08 s ± 20.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    ```
    

    And the result for the reported use case is now -718.49804688, which corresponds to the value at index 1 of the bin centers, while threshold_otsu outputs -726.166015625, which corresponds to the value at index 0...
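
The bin-center values quoted in this thread can be reproduced by hand. The sketch below assumes 256 equal-width bins spanning the minimum to the maximum of the data. The 256 is an assumption about the default bin count, and the helper `bin_center` is a made-up name for the example.

```python
# Array from the original report.
values = [-730., 371., 485., 514., 640., 687., 699.,
          825., 828., 1025., 1082., 1129., 1233.]

nbins = 256  # assumed default number of histogram bins

lo, hi = min(values), max(values)
width = (hi - lo) / nbins  # width of one bin


def bin_center(i):
    """Center of the i-th equal-width bin between lo and hi."""
    return lo + (i + 0.5) * width


print(bin_center(0))    # -726.166015625
print(bin_center(1))    # -718.498046875
print(bin_center(203))  # 830.431640625
```

Under these assumptions, the three thresholds reported in this thread are the centers of bins 0, 1 and 203. The only sample between the centers of bins 0 and 1 is -730.0 itself, and no sample lies between -730 and 371, so both thresholds separate the data identically.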
