1

A Python range is an interesting object because it behaves like a generator when it comes to memory, but it otherwise behaves like a sequence and it additionally has O(1) time complexity for in operation, .index() and .count().

This is an example of a dynamic sequence, i.e. a Sequence object that does not store its elements in memory.

How do I implement a dynamic sequence in Python?

The in operation, .index() and .count methods implemented in O(1) time, would, of course, be a nice addition.


For concreteness, let us consider the case of a Geometric sequence:

s_k = a * r ** k

where k is an integer.

Obviously, one cannot simply use generator (simplified) like:

def geom_seq(a, r, start, stop=None):
    if stop is None:
        start, stop = 0, start
    else:
        start, stop = start, stop
    item = a * r ** start
    for _ in range(start, stop):
        yield item
        item *= r

because this, while being memory efficient would not behave like a Sequence, e.g.:

a = geom_seq(1, 2, 8)

len(a)
# TypeError...

list(a)
# [1, 2, 4, 8, 16, 32, 64, 128]
list(a)  # generator is consumed...
# []

On the contrary, something like:

a = tuple(geom_seq(1, 2, 8))

will be(have like) a Sequence but would not be memory efficient:

import sys


print(sys.getsizeof(geom_seq(1, 2, 1000)))
# 88
print(sys.getsizeof(tuple(geom_seq(1, 2, 1000))))
# 8048

EDIT

This is not a duplicate of How to implement a minimal class that behaves like a sequence in Python? as time / memory considerations are not discussed there.

1

1 Answer 1

1

A simplified implementation of a Geometric sequence with the values not stored internally would look like:

import collections.abc
import math


class GeomRange(collections.abc.Sequence):
    def __init__(self, a, r, start, stop=None):
        self.a = a
        self.r = r
        if stop is None:
            start, stop = 0, start
        self.k = range(start, stop, 1)

    def __getitem__(self, i):
        if isinstance(i, int):
            if i < 0:
                i += len(self)
            if i in self.k:
                return self.a * self.r ** (self.k.start + i)
            else:
                raise IndexError
        elif isinstance(i, slice):
            sliced = self.k[i]
            return type(self)(self.a, self.r, sliced.start, sliced.stop)

    def __len__(self):
        return len(self.k)

    def __contains__(self, item):
        r_k = item / self.a
        if r_k > 0:
            k = math.log2(r_k) / math.log2(self.r)
            if math.isclose(k % 1, 1.0):
                k = int(k)
        else:
            k = None
        return k in self.k

    def __iter__(self):
        item = self.a * self.r ** self.k.start
        for _ in self.k:
            yield item
            item *= self.r

    def __reversed__(self):
        item = self.a * self.r ** (self.k.stop - 1)
        for _ in reversed(self.k):
            yield item
            item //= self.r

    def index(self, item):
        r_k = item / self.a
        if r_k > 0:
            k = math.log2(r_k) / math.log2(self.r)
            if math.isclose(k % 1, 1.0):
                k = int(k)
        else:
            k = None
        return self.k.index(k)


    def count(self, item):
        if item in self:
            return 1
        else:
            raise ValueError

    def __str__(self):
        return f'Geom[{self.a}, {self.r}]-{self.k}'

    __repr__ = __str__

Note that the above code does not necessarily handle well all corner cases. For example, it assumes a and r to be non-negative ints.

This object would behave essentially like range(), but would produce a Geometric progression:

a = GeomRange(1, 2, 8)
print(a)
# Geom[1, 2]-range(0, 8)
print(len(a))
# 8

print(list(a))
# [1, 2, 4, 8, 16, 32, 64, 128]
print(list(a))
# [1, 2, 4, 8, 16, 32, 64, 128]
print(list(reversed(a)))
# [128, 64, 32, 16, 8, 4, 2, 1]

print(a[2], a[-2])
# 4 64

print((a[:2]), (a[2:]))
# Geom[1, 2]-range(0, 2) Geom[1, 2]-range(2, 8)
print((a[:-2]), (a[-2:]))
# Geom[1, 2]-range(0, 6) Geom[1, 2]-range(6, 8)
print(list(a[:2]), list(a[2:]))
# [1, 2] [4, 8, 16, 32, 64, 128]
print(list(a[:-2]), list(a[-2:]))
# [1, 2, 4, 8, 16, 32] [64, 128]

print((a[:10]), (a[10:]))
# Geom[1, 2]-range(0, 8) Geom[1, 2]-range(8, 8)
print((a[:-10]), (a[-10:]))
# Geom[1, 2]-range(0, 0) Geom[1, 2]-range(0, 8)
print(list(a[:10]), list(a[10:]))
# [1, 2, 4, 8, 16, 32, 64, 128] []
print(list(a[:-10]), list(a[-10:]))
# [] [1, 2, 4, 8, 16, 32, 64, 128]

print(a.index(4))
# 2
print(a.count(8))
# 1

print(all((
    0 not in a,
    1 in a,
    128 in a,
    256 not in a,
    2 in a,
    3 not in a,
    1.99999999 not in a)))
# True

with O(1) operations when possible, e.g.:

%timeit 2 ** 100 in GeomRange(1, 2, 1000)
# 100000 loops, best of 3: 4.7 µs per loop
%timeit 2 ** 100 in GeomRange(1, 2, 1000000)
# 100000 loops, best of 3: 4.68 µs per loop

while still being memory efficient, e.g.:

import sys


print(sys.getsizeof(GeomRange(1, 2, 1000)))
# 56
print(sys.getsizeof(tuple(GeomRange(1, 2, 1000))))
# 8048

The speed price for all this machinery is a few percent when iterating, e.g.:

%timeit tuple(GeomRange(1, 2, 10000))
# 100 loops, best of 3: 3.47 ms per loop
%timeit tuple(geom_seq(1, 2, 10000))
# 100 loops, best of 3: 3.18 ms per loop
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.