Complexity of in operator in Python

Question

What is the complexity of the in operator in Python? Is it theta(n)?

Is it the same as the following?

def find(L, x):
    for e in L:
        if e == x:
            return True
    return False

L is a list.

It depends on the type of container, since using it with a dictionary or set will be much faster than with an array.
– Greg Hewgill, Commented Dec 14, 2012 at 18:19
@Rastegar L doesn't imply a list. seq is the most common choice where one wants to imply a list. L is a terrible variable name. Single-letter names are bad, and the capital implies it's a class. Even if it did imply something in particular, Python is dynamic, so state it explicitly in a case like this.
– Gareth Latty, Commented Dec 14, 2012 at 18:50
@GarethLatty lst is also a good name for a list.
– vmemmap, Commented Sep 24, 2020 at 20:02

Andrew Clark · Accepted Answer · 2018-07-31 00:32:31Z (last edited by Peter Mortensen)

245

The complexity of in depends entirely on what L is: e in L effectively calls L.__contains__(e).
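A minimal sketch of that dispatch. The classes Evens and Countdown are hypothetical and exist only for illustration. The second one has no __contains__ at all, in which case Python falls back to iterating over the object and comparing each element, which is the linear scan from the question.

```python
class Evens:
    """Hypothetical container that 'contains' every even integer."""

    def __init__(self):
        self.calls = 0  # how many times __contains__ has run

    def __contains__(self, item):
        self.calls += 1
        return isinstance(item, int) and item % 2 == 0


class Countdown:
    """Hypothetical iterable that defines __iter__ but not __contains__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


evens = Evens()
print(4 in evens)    # answered by Evens.__contains__
print(7 in evens)
print(evens.calls)   # one call per membership test so far

# No __contains__ here, so `in` iterates and compares element by element.
print(2 in Countdown(5))
```

The cost of in is therefore whatever the container's own lookup (or, failing that, its iteration) costs.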

See the TimeComplexity page on the Python wiki for the complexity of several built-in types.

Here is the summary for in:

list - Average: O(n)
set/dict - Average: O(1), Worst: O(n)

The O(n) worst case for sets and dicts is very uncommon, but it can happen if __hash__ is implemented poorly. This only happens if everything in your set has the same hash value.
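To make the worst case concrete, here is a rough sketch that counts equality comparisons instead of measuring time. BadKey, GoodKey and eq_calls_for_miss are hypothetical names invented for this example. BadKey deliberately returns the same hash for every instance, and GoodKey differs only in hashing its value.

```python
class BadKey:
    """Hypothetical key type whose hash is identical for every instance."""

    eq_calls = 0  # shared counter of __eq__ invocations

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # every key collides with every other key

    def __eq__(self, other):
        BadKey.eq_calls += 1
        return isinstance(other, BadKey) and self.value == other.value


class GoodKey(BadKey):
    """Same key type, but with a hash that spreads values out."""

    def __hash__(self):
        return hash(self.value)


def eq_calls_for_miss(key_type, n):
    """Build a set of n keys, then count comparisons for one failed lookup."""
    items = {key_type(i) for i in range(n)}
    BadKey.eq_calls = 0  # ignore the comparisons made while building the set
    found = key_type(-1) in items  # -1 is never in the set
    return found, BadKey.eq_calls


n = 200
found_bad, bad_calls = eq_calls_for_miss(BadKey, n)
found_good, good_calls = eq_calls_for_miss(GoodKey, n)
print("constant hash:  ", found_bad, bad_calls)
print("spread-out hash:", found_good, good_calls)
```

With the constant hash, the number of comparisons for a single lookup should grow roughly in proportion to n, while the well-distributed hash needs few or none.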

edited Jul 31, 2018 at 0:32

Peter Mortensen

answered Dec 14, 2012 at 18:20

Andrew Clark

7 Comments

Josh Sherick Over a year ago

Does anyone happen to know the complexity of the "in" operator for an OrderedDict?

Josh Sherick Over a year ago

After some testing, I can confirm that the complexity of OrderedDict in Python 2.7 appears to be O(1) in the average case.

mksm Over a year ago

@Josh Sherick you don't have to provide tests; all you need is the source of OrderedDict. As you can see there, OrderedDict inherits from dict, so most operations (with exceptions, of course) have the same complexity.
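The inheritance relationship mentioned here can be checked directly; a short sketch using only the standard library (the variable name od is arbitrary):

```python
from collections import OrderedDict

# OrderedDict is a subclass of dict.
print(issubclass(OrderedDict, dict))

od = OrderedDict.fromkeys(range(5))
print(3 in od, 42 in od)
```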

Inherited Geek Over a year ago

Is the time complexity of "in" operator O(n) for tuple as well?

juanpa.arrivillaga Over a year ago

@whitehat linear.

kindall · Accepted Answer · 2014-04-24 15:08:02Z

25

It depends entirely on the type of the container. Hashing containers (dict, set) use the hash and are essentially O(1). Typical sequences (list, tuple) are implemented as you guess and are O(n). Trees would be average O(log n). And so on. Each of these types would have an appropriate __contains__ method with its big-O characteristics.
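As an illustration of a container supplying its own __contains__ with a different big-O, here is a sketch of a hypothetical SortedBag class. It is not a tree: it uses binary search over a sorted list (via the standard bisect module) as a simpler stand-in that also gives O(log n) membership tests.

```python
import bisect


class SortedBag:
    """Hypothetical container that keeps its items in sorted order."""

    def __init__(self, items=()):
        self._items = sorted(items)

    def __contains__(self, x):
        # Binary search: O(log n) comparisons instead of a linear scan.
        i = bisect.bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x


bag = SortedBag([9, 2, 7, 4])
print(7 in bag)
print(5 in bag)
```

The in operator itself is unchanged; only the container's lookup strategy differs.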

edited Apr 24, 2014 at 15:08

answered Dec 14, 2012 at 18:19

kindall

7 Comments

Woot4Moo Over a year ago

It would be valuable to include the overhead of generating the hash.

Dave Over a year ago

Hashing data types include dict and set (as well as potentially others).

abarnert Over a year ago

@Woot4Moo: When you're talking about asymptotic complexity, that isn't relevant. The overhead of generating the hash is constant. When you're dealing with small values of N, profiling becomes important, because, say, 100 >> 2N for small N. But that's a separate issue from what the OP was asking about; for huge N, 100 << 2N, which is what complexity is all about.

Woot4Moo Over a year ago

@abarnert Well, it actually is quite relevant, as you don't choose data structures arbitrarily. You must consider how the structure will most commonly be used, so it is important to consider the time taken by the hash function, especially in a scenario where the hash must be computed on every iteration of a program.

abarnert Over a year ago

@Woot4Moo: If someone is asking about asymptotic complexity, either (a) they expect to deal with a large N, or (b) they're an idiot. I'm assuming the OP is case (a), but either way, the constant factor isn't relevant to the answer.

Marcin · Accepted Answer · 2012-12-14 18:20:15Z

-1

It depends on the container you're testing. It's usually what you'd expect: linear for ordered data structures, constant for unordered ones. Of course, containers of either kind (ordered or unordered) might be backed by some variant of a tree.

answered Dec 14, 2012 at 18:20

Marcin

5 Comments

Marcin Over a year ago

@ZoranPavlovic A in B tests whether A is in B.

dedObed Over a year ago

I'd definitely expect logarithmic time in an ordered structure.

Marcin Over a year ago

@dedObed Why would you expect that? Do you expect python to already know whether or not your data are sorted?

dedObed Over a year ago

Because if there is a container designed to be ordered, the obvious reason is to allow for logarithmic lookups. But I guess it's just a naming issue, I'd use "linear" where you wrote "ordered" and all would be fine. (In my head -- English as second language here.)

slothrop Over a year ago

"Ordered" in Python usually means that the iterator yields elements in the order they were added to the collection. It doesn't usually refer to the case where they are yielded in sorted order (like from a heap), where you'd indeed expect logarithmic time. Maybe this is a case of different terminology between languages, because C++ often seems to use "ordered" in the sense you mean it here.
