
I don't quite understand how iterators have memory in Python.

>>> from itertools import izip
>>> l1 = [1, 2, 3, 4, 5, 6]
>>> l2 = [2, 3, 4, 5, 6, 7]
>>> iz = izip(l1, l2)

We still require O(len(l1) + len(l2)) memory, since both lists l1 and l2 have to be loaded into memory.

I thought one of the main uses of iterators was to save memory - yet it does not seem to be useful here.
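Concretely, here is a Python 3 sketch of the same situation (in Python 3, izip became the built-in zip). The lists do sit in memory, but the pairing itself is lazy: zip never builds a list of pairs, it pulls one element from each input per step.

```python
# Python 3: zip is lazy, like itertools.izip was in Python 2.
l1 = [1, 2, 3, 4, 5, 6]
l2 = [2, 3, 4, 5, 6, 7]

iz = zip(l1, l2)      # no pairs exist yet
first = next(iz)      # the first pair is produced on demand
print(first)          # (1, 2)
```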

Similarly, the code below is unclear to me:

>>> from itertools import izip
>>> l1 = (n for n in [1, 2, 3, 4, 5, 6])
>>> l2 = (n for n in [2, 3, 4, 5, 6, 7])
>>> iz = izip(l1, l2)

We need to load the lists before wrapping them in generators, right? That means we still spend the memory. So what is the point of generators here, either?

This is the only case that makes sense to me:

from itertools import izip

def build_l1():
    for n in xrange(1, 7):
        yield n

def build_l2():
    for n in xrange(2, 8):
        yield n

l1 = build_l1()
l2 = build_l2()
iz = izip(l1, l2)

Neither sequence is ever materialized in memory, so we're in O(1) memory.
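In Python 3 terms (range is lazy like xrange, and zip is lazy like izip), this pipeline can be sketched end to end; consuming it touches only one pair at a time:

```python
# Python 3 sketch of the same O(1)-memory pipeline.
def build_l1():
    for n in range(1, 7):   # yields 1..6, one value at a time
        yield n

def build_l2():
    for n in range(2, 8):   # yields 2..7, one value at a time
        yield n

pairs = zip(build_l1(), build_l2())
total = sum(a * b for a, b in pairs)  # consumes one pair at a time
print(total)  # 112
```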

How does the memory usage of iterators in Python work? The first two cases still seem to use O(len(l1) + len(l2)) memory, which makes them look useless if the main point of iterators is to save memory.

2 Comments

  • If you iterate over a list, it doesn't save memory. The point is that often you can avoid creating that list in the first place. Also, saving memory is worthwhile even when the savings aren't asymptotic. Commented Feb 1, 2016 at 13:26
  • Your build_l1 and build_l2 don't make much sense; xrange already stores just (from, to, step). Commented Feb 1, 2016 at 13:28

2 Answers


Your examples are too simplistic. Consider this:

nums = [1, 2, 3, 4, 5, 6]
nums_it = (n for n in nums)

nums_it is a generator that yields the items of nums unmodified. Clearly there is no advantage here. But consider this:

squares_it = (n ** 2 for n in nums)

and compare it with:

squares_lst = [n ** 2 for n in nums]

With squares_it, we are generating the squares of nums on the fly only when requested. With squares_lst, we are generating all of them at once and storing them in a new list.
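You can see the difference directly with sys.getsizeof (Python 3 sketch; the exact byte counts vary by interpreter, but the shape of the result doesn't):

```python
import sys

nums = list(range(100_000))

squares_it = (n ** 2 for n in nums)    # generator object: tiny, fixed size
squares_lst = [n ** 2 for n in nums]   # list: one slot per result

print(sys.getsizeof(squares_it))   # on the order of a hundred bytes
print(sys.getsizeof(squares_lst))  # on the order of hundreds of kilobytes
```

The generator's size is constant no matter how long nums is; the list's size grows linearly with it.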

So, when you do:

for n in squares_it:
    print(n)

it's as if you were doing:

for n in nums:
    print(n ** 2)

But when you do:

for n in squares_lst:
    print(n)

it's as if you were doing:

squares_lst = []
for n in nums:
    squares_lst.append(n ** 2)
for n in squares_lst:
    print(n)

If you don't need (or don't have) the list nums, then you can save even more space by using:

squares_it = (n ** 2 for n in xrange(1, 7))

Generators and iterators also provide another significant advantage (which may actually be a disadvantage, depending on the situation): they are evaluated lazily.
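A small sketch of why laziness can surprise you: a generator reads its source when it is consumed, not when it is created, so mutations made in between are visible.

```python
# Lazy evaluation cuts both ways.
nums = [1, 2, 3]
doubled = (n * 2 for n in nums)

nums.append(4)          # mutate the source before consuming

result = list(doubled)
print(result)           # [2, 4, 6, 8] -- the append is visible
```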

Also, generators and iterators may yield an infinite number of elements. An example is itertools.count(), which yields 0, 1, 2, 3, ... without ever ending.
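For example (Python 3), you can build a pipeline over an infinite iterator and use itertools.islice to take only a finite prefix:

```python
from itertools import count, islice

# count() is infinite; the generator filters it lazily,
# and islice takes just the first five results.
evens = (n for n in count() if n % 2 == 0)
first_five = list(islice(evens, 5))
print(first_five)  # [0, 2, 4, 6, 8]
```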


8 Comments

Thanks for the reply. The generator version of squares_lst is still O(n) in memory, though I agree the constant factor is much smaller. Is there a way to achieve O(1) memory using iterators and a pre-defined list?
@zero: squares_it is O(1) memory. Or are you considering the memory for nums too?
The entire algorithm to square those numbers would be O(n), if I am correct?
@zero: yes. But change it to (n ** 2 for n in xrange(1, 7)) and then you have a O(1) space algorithm
@zero: When considering space complexity, you only consider memory in addition to the input.
>>> from itertools import izip
>>> l1 = [1, 2, 3, 4, 5, 6]
>>> l2 = [2, 3, 4, 5, 6, 7]
>>> iz = izip(l1, l2)

We still require O(len(l1) + len(l2)) memory, since both lists l1 and l2 have to be loaded into memory.

With zip you need storage for the two original lists plus the zipped list. With izip you don't store the zipped list.
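A Python 3 sketch of that difference (Python 3's zip behaves like izip; wrapping it in list() reproduces Python 2's eager zip):

```python
import sys

l1 = list(range(10_000))
l2 = list(range(1, 10_001))

lazy = zip(l1, l2)          # like izip: stores no pairs
eager = list(zip(l1, l2))   # like Python 2 zip: stores every pair

print(sys.getsizeof(lazy))   # tiny, independent of input length
print(sys.getsizeof(eager))  # grows linearly with the inputs
```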

Big O notation isn't particularly helpful here if you have to work with a real physical machine instead of some abstract concept of a machine. There's a hidden constant multiplier on your O(n) calculations that could influence the practicality of the code well before n tends to infinity.

>>> from itertools import izip
>>> l1 = (n for n in [1, 2, 3, 4, 5, 6])
>>> l2 = (n for n in [2, 3, 4, 5, 6, 7])
>>> iz = izip(l1, l2)

We need to load the lists before wrapping them in generators, right? That means we still spend the memory. So what is the point of generators here, either?

There is no point to generators here. Any time you see n for n in <expr> without either a more complex expression before the for or an if <expr> filter after it, that's a code smell, as you could just use the original sequence directly. Generators only become useful when you transform the input values into something else or filter them.
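For instance (Python 3 syntax), a generator expression that both transforms and filters does useful work per element, while still holding only one result at a time:

```python
# A generator expression earns its keep once it transforms or filters.
nums = [1, 2, 3, 4, 5, 6]

evens_squared = (n ** 2 for n in nums if n % 2 == 0)
result = list(evens_squared)
print(result)  # [4, 16, 36]
```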

2 Comments

I should note that I'm practising for coding interviews and am trying to get solutions with O(1) memory using iterators.
A generator can also filter items, so n for n in sequence if predicate(n) may be useful without modifying any individual input value.
