141

What algorithm is the built in sort() method in Python using? Is it possible to have a look at the code for that method?

5
  • 14
    Of course it's possible to look at the code for the method - Python is an open-source project. The method is probably implemented in C, however, so you'll have to know a bit about C to make any sense of it. Commented Oct 4, 2009 at 20:50
  • Does the version matter? Commented Oct 4, 2009 at 20:50
  • @melder: No =) I just want to have a look at a pro algorithm :P @chris: how? Commented Oct 4, 2009 at 20:51
  • 3
    Download the source code to the Python interpreter. I don't know where they implement the sort() method, or what the formatting to the interpreter is, but it's got to be in there somewhere, and I bet it's implemented in C for speed concerns. Commented Oct 4, 2009 at 20:54
  • Here is an example of it being used Commented Feb 4, 2014 at 17:21

4 Answers 4

151

Sure! The code's here: listobject.c, starting with function islt and proceeding for QUITE a while ;-). As the file extension suggests, it's C code. You'll also want to read this for a textual explanation, results, etc etc: listsort.txt

If you prefer reading Java code than C code, you could look at Joshua Bloch's implementation of timsort in and for Java (Joshua's also the guy who implemented, in 1997, the modified mergesort that's still used in Java, and one can hope that Java will eventually switch to his recent port of timsort).

Some explanation of the Java port of timsort is in this request for enhancement1, the diff is here2 (with pointers to all needed files), the key file is here3 -- FWIW, while I'm a better C programmer than Java programmer, in this case I find Joshua's Java code more readable overall than Tim's C code ;-).


Editor's notes

  1. Archive link: Bug ID: 6804124 - Replace "modified mergesort" in java.util.Arrays.sort with timsort
  2. Archive link: jdk7/tl/jdk: changeset 1423:bfd7abda8f79 (6804124)
  3. Dead link. This may be the modern equivalent but I don't know Java: TimSort.java at master - openjdk/jdk
Sign up to request clarification or add additional context in comments.

4 Comments

I want to know what the function list_ass_item() does. :)
Performs assignment to an item of the list (just like list_ass_slice performs assignment to a slice of the list), nothing to do with sorting. I guess the abbreviation of "assignment" makes the name funny...
The current version of listsort.txt adds some notes that address common confusions.
What are the theoretical time and space complexities of timsort?
12

In early versions of Python, the sort function implemented a modified version of quicksort. However, in 2.3 this was replaced with an adaptive mergesort algorithm, in order to provide a stable sort by default.

2 Comments

quicksort isn't 'deemed unstable' - by the commonly used definitions, quicksort is unstable - that is the original ordering of two objects is not preserved, if they are equal during this sort. Having an unstable sort algorithm means that you have to come up with all sorts of 'magic' to sensibly sort data with multiple criteria, rather than then simply being able to chain sort calls - in timsort if you want to sort by D.O.B and then name, you simply sort by name first, and then D.O.B. You can't do that with quicksort
@TonySuffolk66 I think the author of this answer may have, at the time, been reading the documentation and misunderstood "stable" as referring to software quality - rather than to a property of sorting algorithms. I edited the answer so that it is clear about the meaning, and added a reference link for that concept.
5

Since python version 3.11, sort() now uses a version of powersort, a merge-sort algorithm that takes advantage of runs: sequences of already-sorted values in the data. When the length is less than 64, python switches to binary insertion sort.

python implementation details: https://github.com/python/cpython/blob/main/Objects/listsort.txt

1 Comment

How does it differ from timsort?
0

It uses an adaptive stable mergesort sped-up by looking for pre-existing ascending and descending runs. It is called TimSort, named after its creator.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.