44

Are there any self-balancing binary search tree (RED-BLACK, AVL or others) built-in types in Python 2.7 or Python 3.x?

I am looking for something equivalent to Java's TreeMap or TreeSet.

If there are no such built-ins, why have they been ommited? Is there a special reason, for not including such tools?

3
  • 1
    There are none in the standard library. Why one hasn't been included is impossible to answer. The rest of your question is a request for external resources, which is off-topic. Commented Jul 25, 2013 at 12:03
  • Near-dupe of stackoverflow.com/questions/2298165/… Commented Jul 25, 2013 at 12:17
  • 2
    There's a Heap implementation which is a sort of binary tree. Look for heapq Commented Feb 16, 2022 at 5:21

2 Answers 2

42

There's no special reason, to my knowledge - I'd guess that the reason is that for so many applications the highly-tuned dict and set implementations (which are hash tables) work well. They're good enough in most cases. There are definitely situations where you need the performance characteristics of balanced binary search trees (like ordered traversal based on key- rather than addition-order), but those are far enough off the beaten path that people are happy with grabbing a third-party package in that case.

I've had a good experience using the bintrees package on PyPI. This has implementations of unbalanced, AVL and red-black binary trees, in both pure Python and as extensions written in Cython.

I think the rest of the reason is essentially historical accident. If the person who wrote bintrees lobbied for its inclusion in the stdlib, and was willing to put up with the constraints that imposes on maintenance and releases, it would probably go in. (Although the Cython dependency would cause a problem, I'd guess.)

Algorithmic complexity:

For hash tables (like dicts or sets), insertion and lookup are O(1), while for a balanced tree these are O(log(n)). In-order traversal of keys is O(n) in a tree, but to do the same thing with a hash table you need to sort the keys first, so it's O(n*log(n)). When you're picking which kind of data structure to use, you need to think about which operations you're going to be using, and pick the tradeoff that makes the most sense in your application.

Sign up to request clarification or add additional context in comments.

10 Comments

This makes a lot of sense. I'm guessing that probably dict and set are actually binary search trees, or maybe something even more efficient. I will make some research in how exactly these are implemented.
@ChirilaAlexandru: dict and set are hash tables.
@ChirilaAlexandru: hash table lookups are O(1), rather than O(n) - this is why they're so often what you want. This is sometimes balanced by the fact that choosing a good hash function can be tricky - if you have lots of collisions a lot of the benefit disappears.
Or rather, hash tables in the worst case can be O(n), but with good hash functions are O(1) in the expected case. They do use more memory than a tree of the same size to help ensure the expected case actually happens, but the extra memory is also used to amortize the cost expanding the data structure as more items are added as well.
i think sorted containers is recommended now pypi.org/project/sortedcontainers instead of the bintree library
|
6

You won't find any trees in the standard library. Python heavily uses dictionary that is hash table for its internal (object, classes and modules are all based on dicts). Therefore dicts has been greatly optimized. This make the needs for search trees much smaller. Also to be efficient such trees would have been implemented in an extension type.

4 Comments

How do dicts make the need for search trees smaller? And what is an extension type?
dict = hash table. extension type = builtin type
To my understanding dicts don't replace search trees; they each have their own use cases
One of the main reasons one would want balanced bst instead of hashtable is the possibility to iterate all items that are less or larger than one specified, which is very efficient in case of database indexes. There's no such possibility in a hashtable without extra calculations since it's unordered

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.