What algorithm is the built in sort() method in Python using? Is it possible to have a look at the code for that method?
4 Answers
Sure! The code's here: listobject.c, starting with function islt and proceeding for QUITE a while ;-). As the file extension suggests, it's C code. You'll also want to read this for a textual explanation, results, etc etc: listsort.txt
If you prefer reading Java code than C code, you could look at Joshua Bloch's implementation of timsort in and for Java (Joshua's also the guy who implemented, in 1997, the modified mergesort that's still used in Java, and one can hope that Java will eventually switch to his recent port of timsort).
Some explanation of the Java port of timsort is in this request for enhancement1, the diff is here2 (with pointers to all needed files), the key file is here3 -- FWIW, while I'm a better C programmer than Java programmer, in this case I find Joshua's Java code more readable overall than Tim's C code ;-).
Editor's notes
- Archive link: Bug ID: 6804124 - Replace "modified mergesort" in java.util.Arrays.sort with timsort
- Archive link: jdk7/tl/jdk: changeset 1423:bfd7abda8f79 (6804124)
- Dead link. This may be the modern equivalent but I don't know Java: TimSort.java at master - openjdk/jdk
4 Comments
list_ass_item() does. :)listsort.txt adds some notes that address common confusions.In early versions of Python, the sort function implemented a modified version of quicksort. However, in 2.3 this was replaced with an adaptive mergesort algorithm, in order to provide a stable sort by default.
2 Comments
Since python version 3.11, sort() now uses a version of powersort, a merge-sort algorithm that takes advantage of runs: sequences of already-sorted values in the data. When the length is less than 64, python switches to binary insertion sort.
python implementation details: https://github.com/python/cpython/blob/main/Objects/listsort.txt
1 Comment
It uses an adaptive stable mergesort sped-up by looking for pre-existing ascending and descending runs. It is called TimSort, named after its creator.
sort()method, or what the formatting to the interpreter is, but it's got to be in there somewhere, and I bet it's implemented in C for speed concerns.