derek5. 2015-05-06 15:32 采纳率: 100%
浏览 353
已采纳

为什么"1000000000000000 in range (1000000000000001)"在 python3中运行得这么快?

It is my understanding that the range() function, which is actually an object type in Python 3, generates its contents on the fly, similar to a generator.

This being the case, I would have expected the following line to take an inordinate amount of time, because in order to determine whether 1 quadrillion is in the range, a quadrillion values would have to be generated:

1000000000000000 in range(1000000000000001)

Furthermore: it seems that no matter how many zeroes I add on, the calculation more or less takes the same amount of time (basically instantaneous).

I have also tried things like this, but the calculation is still almost instant:

1000000000000000000000 in range(0,1000000000000000000001,10) # count by tens

If I try to implement my own range function, the result is not so nice!!

def my_crappy_range(N):
    i = 0
    while i < N:
        yield i
        i += 1
    return

What is the range() object doing under the hood that makes it so fast?


Martijn Pieters' answer was chosen for its completeness, but also see abarnert's first answer for a good discussion of what it means for range to be a full-fledged sequence in Python 3, and some information/warning regarding potential inconsistency for __contains__ function optimization across Python implementations. abarnert's other answer goes into some more detail and provides links for those interested in the history behind the optimization in Python 3 (and lack of optimization of xrange in Python 2). Answers by poke and by wim provide the relevant C source code and explanations for those who are interested.

转载于:https://stackoverflow.com/questions/30081275/why-is-1000000000000000-in-range1000000000000001-so-fast-in-python-3

  • 写回答

6条回答 默认 最新

  • 喵-见缝插针 2015-05-06 15:33
    关注

    The Python 3 range() object doesn't produce numbers immediately; it is a smart sequence object that produces numbers on demand. All it contains is your start, stop and step values, then as you iterate over the object the next integer is calculated each iteration.

    The object also implements the object.__contains__ hook, and calculates if your number is part of its range. Calculating is a O(1) constant time operation. There is never a need to scan through all possible integers in the range.

    From the range() object documentation:

    The advantage of the range type over a regular list or tuple is that a range object will always take the same (small) amount of memory, no matter the size of the range it represents (as it only stores the start, stop and step values, calculating individual items and subranges as needed).

    So at a minimum, your range() object would do:

    class my_range(object):
        def __init__(self, start, stop=None, step=1):
            if stop is None:
                start, stop = 0, start
            self.start, self.stop, self.step = start, stop, step
            if step < 0:
                lo, hi = stop, start
            else:
                lo, hi = start, stop
            self.length = ((hi - lo - 1) // abs(step)) + 1
    
        def __iter__(self):
            current = self.start
            if self.step < 0:
                while current > self.stop:
                    yield current
                    current += self.step
            else:
                while current < self.stop:
                    yield current
                    current += self.step
    
        def __len__(self):
            return self.length
    
        def __getitem__(self, i):
            if i < 0:
                i += self.length
            if 0 <= i < self.length:
                return self.start + i * self.step
            raise IndexError('Index out of range: {}'.format(i))
    
        def __contains__(self, num):
            if self.step < 0:
                if not (self.stop < num <= self.start):
                    return False
            else:
                if not (self.start <= num < self.stop):
                    return False
            return (num - self.start) % self.step == 0
    

    This is still missing several things that a real range() supports (such as the .index() or .count() methods, hashing, equality testing, or slicing), but should give you an idea.

    I also simplified the __contains__ implementation to only focus on integer tests; if you give a real range() object a non-integer value (including subclasses of int), a slow scan is initiated to see if there is a match, just as if you use a containment test against a list of all the contained values. This was done to continue to support other numeric types that just happen to support equality testing with integers but are not expected to support integer arithmetic as well. See the original Python issue that implemented the containment test.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(5条)

报告相同问题?

悬赏问题

  • ¥30 软件自定义无线电该怎样使用
  • ¥15 Jenkins+k8s部署slave节点offline
  • ¥15 微信小游戏反编译后,出现找不到分包的情况
  • ¥15 如何实现从tello无人机上获取实时传输的视频流,然后将获取的视频通过yolov5进行检测
  • ¥15 WPF使用Canvas绘制矢量图问题
  • ¥15 用三极管设计一个单管共射放大电路
  • ¥15 孟德尔随机化r语言运行问题
  • ¥15 pyinstaller编译的时候出现No module named 'imp'
  • ¥15 nirs_kit中打码怎么看(打码文件是csv格式)
  • ¥15 怎么把多于硬盘空间放到根目录下