1

I've written a function called size_subsets that returns all subsets of a certain size, when passed a list of cities (numbers). However, restating the function using izip() instead of two for-yield blocks completely breaks the behavior.

The second method below restates the first method using izip(), but for some reason it returns nothing at the top level. Can anyone help me figure out why this is?

Print statements show that SOME (not all) of the correct subsets do get yielded at the bottom-most level of the recursion in size_subsets_broken, but even these aren't making it to the top level for some reason.

def size_subsets(cities, size, sofar):
    if not size:
        yield sofar
        return
    elif len(cities) < size:
        return
    else:
        curr_city = cities.pop()
        for a in size_subsets(cities[:], size - 1, sofar | {curr_city}):
            yield a
        for b in size_subsets(cities[:], size, sofar):
            yield b


def size_subsets_broken(cities, size, sofar):
    if not size:
        yield sofar
        return
    elif len(cities) < size:
        return
    else:
        curr_city = cities.pop()
        inclusive = size_subsets_broken(cities[:], size - 1, sofar | {curr_city})
        exclusive = size_subsets_broken(cities[:], size, sofar)

        for incl_subset, excl_subset in izip(inclusive, exclusive):
            yield incl_subset
            yield excl_subset


print list(size_subsets([1, 2, 3], 2, set()))           # [set([2, 3]), set([1, 3]), set([1, 2])]
print list(size_subsets_broken([1, 2, 3], 2, set()))    # []
1
  • Your first method yields all the a values, followed by all the b values. Your second method alternates between them. Do you want that? Commented May 22, 2016 at 8:16

1 Answer 1

1

I suspect you misunderstand how izip() works. It just plain stops when the shortest input iterable is exhausted, and there's no reason to believe that your inclusive and exclusive always have the same length.

>>> from itertools import izip
>>> for i in izip(range(10), [6]):
...     print i
(0, 6)

>>> for i in izip(range(10), []):
...     print i

Note that there's only one output in the first example, and no output at all in the second. That's all expected.

DETAILS

The top-level call to size_subsets_broken() creates a generator-iterator (gi) object. The call to list() forces the latter to do something.

It creates inclusive and exclusive gi objects, with arguments (ignoring sofar) [1, 2], 1 and [1, 2], 2. izip() then attempts to combine them.

izip() first tries to get a value from inclusive (which eventually fails to deliver anything, which is why the top-level gi never yields anything either - indeed, it never even attempts to force exclusive to yield anything, because inclusive is empty - izip() "just plain stops when the shortest input iterable is exhausted").

Recall that the arguments to the top-level inclusive were [1, 2], 1. It creates gi objects for [1], 0 and [1], 1. Its izip() pushes the [1], 0 branch to yield a value, which it does, via the if not size test. So its izip() goes on to push the [1], 1 branch for a value.

That in turn creates [], 0 and [], 1 gi branches. Another layer of izip() pushes the first of those to yield a value (again via the if not size test), but the second branch yields no value because of if len(cities) < size: return. So the izip() one level up gives up, and the gi it belongs to yields nothing.

Which propagates up the chain: at each level, izip() finds at least one empty iterator, so the body of the for ... in izip(...): loop is never entered (at any level).

This isn't a problem with izip(), by the way. It simply doesn't make sense to try to use izip() in this algorithm.

Sign up to request clarification or add additional context in comments.

2 Comments

That's right, thanks. In this case I'm simply trying to exhaust both generators, not iterate through them in parallel until the shortest generator was exhausted. I was trying to save a few lines of code by using izip(), but that's simply not what izip is used for. Thanks much!
You could instead use itertools.chain(inclusive, exclusive), which exhausts its input iterables one at a time, left to right.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.