7

Why does adding elements to a set take longer than adding elements to a list in python? I created a loop and iterated over 1000000 elements added it to a list and a set. List is consistently taking around 10 seconds vs set which is taking around 20 seconds.

0

2 Answers 2

13

Both operations are O(1) amortized time-complexity.

Appending elements to a list has a lower coefficient because it does not need to hash the element first, nor does it need to check/handle hash collisions.

In the case of adding x into a set, Python needs to compute hash(x) first, because keeping the hash of all elements is what allows sets to have fast O(1) membership checks (compared to O(n) membership checks for lists).

Sign up to request clarification or add additional context in comments.

8 Comments

Thanks @wim , so in case of set , the value of constant is higher ?
Yes. It's slower in absolute, but it's not becoming any slower depending on the size of the set/list.
Not just hash(x) - it also may have to test arr[i] == x several times when there is a hash collision.
Does adding involve o(log n) check to see if item already exists in the set before performing the insertion?
The hash will tell where in memory to populate an element, if that place is already filled by an unequal object then you have a deterministic "perturbation" algorithm telling you how to find some other index to use instead. The storage will resize itself dynamically so that usually only have to probe 0 or 1 times, regardless of the size of the set. This is how you can get O(1) on average, but sometimes it will be significantly slower when there is need to reallocate storage (that last part is true for both set and list, by the way).
|
8

The time complexity for appending to a list is the same as for adding to a set - both are O(1) amortised operations, meaning on average they each take a constant amount of time, although occasionally the operation may take more than that constant amount of time in order to dynamically resize the array that the data is stored in.

However, just because both are O(1) doesn't mean they take the same amount of time:

  • Appending to a list should be faster, because a list is implemented as a dynamic array, so appending an element just requires writing that element at the right index (which is already known), and increasing the length by 1.
  • In contrast, adding to a set is slower, because it requires computing the hash of the element to find the index to start looking from, then testing indices in some sequence to see if the element is already there (by testing if there is an element at the index, and if so, whether it equals the element being inserted) until either finding it, or finding an empty space where the element should be added.

2 Comments

"then testing each element from that index onwards" <-- this is not correct, Python does not use linear probing.
@wim You're right, I misremembered this detail, so thanks for the correction. Wikipedia doesn't seem to have an article on the particular scheme Python does use so I've replaced the link with a generic one.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.