Why does adding elements to a set take longer than adding elements to a list in python? I created a loop and iterated over 1000000 elements added it to a list and a set. List is consistently taking around 10 seconds vs set which is taking around 20 seconds.
2 Answers
Both operations are O(1) amortized time-complexity.
Appending elements to a list has a lower coefficient because it does not need to hash the element first, nor does it need to check/handle hash collisions.
In the case of adding x into a set, Python needs to compute hash(x) first, because keeping the hash of all elements is what allows sets to have fast O(1) membership checks (compared to O(n) membership checks for lists).
8 Comments
hash(x) - it also may have to test arr[i] == x several times when there is a hash collision.The time complexity for appending to a list is the same as for adding to a set - both are O(1) amortised operations, meaning on average they each take a constant amount of time, although occasionally the operation may take more than that constant amount of time in order to dynamically resize the array that the data is stored in.
However, just because both are O(1) doesn't mean they take the same amount of time:
- Appending to a list should be faster, because a list is implemented as a dynamic array, so appending an element just requires writing that element at the right index (which is already known), and increasing the length by 1.
- In contrast, adding to a set is slower, because it requires computing the hash of the element to find the index to start looking from, then testing indices in some sequence to see if the element is already there (by testing if there is an element at the index, and if so, whether it equals the element being inserted) until either finding it, or finding an empty space where the element should be added.