128

Assuming that I have a list with a huge number of items,

l = [ 1, 4, 6, 30, 2, ... ]

I want to count the items in that list that satisfy a certain condition. My first thought was:

count = len([i for i in l if my_condition(i)])

But if the filtered list also has a great number of items, I think that creating a new list for the filtered result is just a waste of memory. For efficiency, IMHO, the above call can't be better than:

count = 0
for i in l:
    if my_condition(i):
        count += 1

Is there any functional-style way to get the number of items that satisfy the condition without generating a temporary list?

    The choice between generators and lists is a choice between execution time and memory consumption. You would be surprised how often the results are counterintuitive if you profile the code. Premature optimization is the root of all evil. Commented Mar 13, 2013 at 1:00
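
A rough way to act on the profiling advice is to time both variants with timeit. This is only a sketch: the predicate, the sample data and the repeat count below are made up for illustration, and the timings will differ between machines and Python versions, so no particular result is claimed.

```python
import timeit

def my_condition(x):
    # Placeholder predicate, chosen only for illustration.
    return x % 4 == 3

data = list(range(10000))  # made-up sample data

def count_with_list():
    # Builds a temporary list, then takes its length.
    return len([i for i in data if my_condition(i)])

def count_with_generator():
    # Adds 1 per matching item without storing the matches.
    return sum(1 for i in data if my_condition(i))

# Both approaches must agree before their timings are worth comparing.
assert count_with_list() == count_with_generator()

list_time = timeit.timeit(count_with_list, number=20)
gen_time = timeit.timeit(count_with_generator, number=20)
print("list: %.4fs  generator: %.4fs" % (list_time, gen_time))
```

Measuring on your own data is the only reliable way to decide which variant is faster for you.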

6 Answers

161

You can use a generator expression:

>>> l = [1, 3, 7, 2, 6, 8, 10]
>>> sum(1 for i in l if i % 4 == 3)
2

or even

>>> sum(i % 4 == 3 for i in l)
2

which uses the fact that True == 1 and False == 0.
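
One caveat with the second form, shown as a small sketch: it only counts correctly when the condition returns a real bool. The helper names count_truthy and remainder below are hypothetical, introduced just for this example.

```python
# bool is a subclass of int, so True and False add up like 1 and 0.
assert issubclass(bool, int)
assert True + True == 2

def count_truthy(items, predicate):
    """Count items for which predicate returns a truthy value (hypothetical helper)."""
    # bool() turns any truthy non-bool result (e.g. 3) into exactly 1.
    return sum(bool(predicate(item)) for item in items)

def remainder(x):
    # Returns 0..3, not a bool, so summing it directly does not count items.
    return x % 4

values = [1, 3, 7, 2, 6, 8, 10]
naive = sum(remainder(x) for x in values)    # adds the remainders themselves
counted = count_truthy(values, remainder)    # counts the non-zero remainders
print(naive, counted)
```

If the condition may return something other than True or False, the sum(1 for ... if ...) form or an explicit bool() is the safer choice.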

Alternatively, you could use itertools.imap (Python 2) or simply map (Python 3):

>>> def my_condition(x):
...     return x % 4 == 3
... 
>>> sum(map(my_condition, l))
2

33

You want a generator expression rather than a list comprehension here.

For example,

l = [1, 4, 6, 7, 30, 2]

def my_condition(x):
    return x > 5 and x < 20

print(sum(1 for x in l if my_condition(x)))
# -> 2
print(sum(1 for x in range(1000000) if my_condition(x)))
# -> 14

Or use itertools.imap in Python 2 (though I think the explicit list comprehensions and generator expressions look somewhat more Pythonic).

Note that, though it's not obvious from the sum example, you can compose generator expressions nicely. For example,

inputs = range(1000000)              # In Python 2, use xrange to avoid building a list
odds = (x for x in inputs if x % 2)  # Pick odd numbers
sq_inc = (x**2 + 1 for x in odds)    # Square and add one
print(sum(x // 2 for x in sq_inc))   # Actually evaluate each one
# -> 83333333333500000

The cool thing about this technique is that you can specify conceptually separate steps in code without forcing evaluation and storage in memory until the final result is evaluated.
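
To see that nothing runs until the final consumer asks for values, here is a minimal sketch. The traced_square function and the calls list are hypothetical and exist only to record when the work happens.

```python
calls = []  # records every argument traced_square is called with

def traced_square(x):
    calls.append(x)
    return x * x

# Defining the generator expression does not call traced_square yet.
squares = (traced_square(x) for x in range(5))
calls_before = len(calls)

# sum() pulls the values one at a time; only now does the work happen.
total = sum(squares)
calls_after = len(calls)

print(calls_before, calls_after, total)
```

The same holds for every stage in a chain of generator expressions: each item flows through all stages before the next one is produced.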


14

This can also be done using reduce, if you prefer functional programming:

from functools import reduce  # required in Python 3; reduce is a builtin in Python 2

reduce(lambda count, i: count + my_condition(i), l, 0)

This way you make only one pass and no intermediate list is generated.
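
For completeness, a self-contained version of the reduce approach that can be pasted and run as-is. The predicate and the sample list are borrowed from the examples earlier on this page and are only illustrative.

```python
from functools import reduce

def my_condition(x):
    # Illustrative predicate; replace with your own.
    return x % 4 == 3

l = [1, 3, 7, 2, 6, 8, 10]

# The accumulator starts at 0; each True result adds 1, each False adds 0.
count = reduce(lambda acc, i: acc + my_condition(i), l, 0)
print(count)

# Because reduce makes a single pass, it also works on a one-shot iterator.
count_from_iterator = reduce(lambda acc, i: acc + my_condition(i), iter(l), 0)
```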


12

You could do something like:

l = [1,2,3,4,5,..]
count = sum(1 for i in l if my_condition(i))

which just adds 1 for each element that satisfies the condition.


2
from itertools import imap
sum(imap(my_condition, l))

1 Comment

imap is not available in Python 3; the built-in map is lazy there and can be used instead.
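
If the same code has to run on both Python 2 and Python 3, one possible workaround is an import fallback. This is a sketch; the name lazy_map is made up for the example.

```python
try:
    from itertools import imap as lazy_map  # Python 2: lazy version of map
except ImportError:
    lazy_map = map  # Python 3: the built-in map is already lazy

def my_condition(x):
    # Illustrative predicate; replace with your own.
    return x % 4 == 3

l = [1, 3, 7, 2, 6, 8, 10]

# Each True counts as 1, so the sum is the number of matching items.
count = sum(lazy_map(my_condition, l))
print(count)
```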
1

You can use len(list(filter(my_condition, l))). In Python 3, filter returns an iterator over the values for which the function returns a true value. A filter object doesn't support len (it has no __len__), so the list constructor must be called first.
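
Since list(...) materializes every match, a possible variation that keeps filter but avoids the temporary list is to count the items as they come out of the iterator. A sketch, assuming Python 3 and an illustrative predicate:

```python
def my_condition(x):
    # Illustrative predicate; replace with your own.
    return x % 4 == 3

l = [1, 3, 7, 2, 6, 8, 10]

# Original approach: builds a list of all matches, then measures it.
count_via_list = len(list(filter(my_condition, l)))

# Variation: consume the filter iterator one item at a time, storing nothing.
count_lazy = sum(1 for _ in filter(my_condition, l))

print(count_via_list, count_lazy)
```

Both give the same count; only the second avoids holding the matches in memory.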
