0

this is a possible way to implement add and delete functions for a BST in Python. It somewhat resembles my ideas of BST in C++. As evident by the code for deleting, I want to delete a node, which I cannot do due to lack of pass by reference in Python as it is present in C++. What is a good alternative way to delete a node apart from del currNode(this does not work). I have one more question to clarify my idea about pass by reference in Python, when a node is being using added using add function, it remains "attached" to the root node, when add is being called recursively. However, when a node is being deleted, it is not being "detached" from the root node. Why is this so?

class node(object):
    def __init__(self, data = None):
        self.data = data
        self.left = None
        self.right = None

class bst(object):
    def __init__(self):
        self.root = None
    
    def add(self, value):
        
        def _add(self, currNode, value):
            if value < currNode.data:
                if currNode.left == None:
                    currNode.left = node(value)
                else:
                    _add(self, currNode.left, value)

            elif value > currNode.data:
                if currNode.right == None:
                    currNode.right = node(value)
                else:
                    _add(self, currNode.right, value)
        
            else:
                print("Duplicate found")

        
        if self.root == None:
            self.root = node(value)
        else:
            _add(self, self.root, value)


    def printBST(self):
        def _printBST(self, currNode):
            if currNode!= None:
                _printBST(self, currNode.left)
                print(currNode.data, end = " ")
                _printBST(self, currNode.right)
        
        if self.root != None:
            _printBST(self,self.root)
    

    def minBST(self,currNode):
        def _minBST(self, currNode):
            if currNode.left == None:
                return currNode.data
            else: return _minBST(self, currNode.left)
        
        if currNode != None:
            return _minBST(self, currNode)
        else:
            return -10000
    
    def deleteValue(self, val):
        
        def deleteNode(self, currNode, value):
            if currNode == None: 
                return
            elif value > currNode.data:
                return deleteNode(self, currNode.right, value)
            elif value < currNode.data:
                return deleteNode(self, currNode.left, value)
            else:
                if currNode.left == None and currNode.right == None:
                    #del currNode
                    currNode.data = None

                elif currNode.right == None:
                    currNode.data = None
                    #The address of currNode does not change
                    #as it happens in C++
                    #currNode = currNode.left

                elif currNode.left == None:
                    currNode.data = None
                    #The address of currNode does not change
                    #currNode = currNode.right
            
                else:
                    minV = self.minBST(currNode.right)
                    currNode.data = minV
                    return deleteNode(self, currNode.right, minV)

        deleteNode(self, self.root, val)
    


if __name__ == '__main__':
    b = bst()
    b.add(50)
    b.add(60)
    b.add(40)
    b.add(30)
    b.add(45)
    b.add(55)
    b.add(100)
    b.printBST()
    b.deleteValue(100)
    print("")
    b.printBST()
 

2 Answers 2

3
+150

node structure and insertion

We start with a simple node structure, but notice the left and right properties can be set at the time of construction -

# btree.py

class node:
  def __init__(self, data, left=None, right=None):
    self.data = data
    self.left = left
    self.right = right

Recursion is a functional heritage and so using it with functional style yields the best results. This means avoiding things like mutation, variable reassignment, and other side effects. Notice how add always constructs a new node rather than mutating an old one. This is the reason we designed node to accept all properties at time of construction -

# btree.py (continued)

def add(t, q):
  if not t:
    return node(q)
  elif q < t.data:
    return node(t.data, add(t.left, q), t.right)
  elif q > t.data:
    return node(t.data, t.left, add(t.right, q))
  else:
    return node(q, t.left, t.right)

inorder traversal and string conversion

After we add some nodes, we need a way to visualize the tree. Below we write an inorder traversal and a to_str function -

# btree.py (continued)

def inorder(t):
  if not t: return
  yield from inorder(t.left)
  yield t.data
  yield from inorder(t.right)


def to_str(t):
  return "->".join(map(str,inorder(t)))

btree object interface

Notice we did not over-complicate our plain functions above by entangling them with a class. We can now define a btree object-oriented interface which simply wraps the plain functions -

# btree.py (continued)

class btree:
  def __init__(self, t=None): self.t = t
  def __str__(self): return to_str(self.t)
  def add(self, q): return btree(add(self.t, q))
  def inorder(self): return inorder(self.t)

Notice we also wrote btree.py as its own module. This defines a barrier of abstraction and allows us to expand, modify, and reuse features without tangling them with other areas of your program. Let's see how our tree works so far -

# main.py

from btree import btree

t = btree().add(50).add(60).add(40).add(30).add(45).add(55).add(100)

print(str(t))
# 30->40->45->50->55->60->100

minimum and maximum

We'll continue working like this, defining plain function that work directly on node objects. Next up, minimum and maximum -

# btree.py (continued)

from math import inf

def minimum(t, r=inf):
  if not t:
    return r
  elif t.data < r:
    return min(minimum(t.left, t.data), minimum(t.right, t.data))
  else:
    return min(minimum(t.left, r), minimum(t.right, r))

def maximum(t, r=-inf):
  if not t:
    return r
  elif t.data > r:
    return max(maximum(t.left, t.data), maximum(t.right, t.data))
  else:
    return max(maximum(t.left, r), maximum(t.right, r))

The btree interface provides only a wrapper of our plain functions -

# btree.py (continued)

class btree:
  def __init__():       # ...
  def __str__():        # ...
  def add():            # ...
  def inorder():        # ...
  def maximum(self): return maximum(self.t)
  def minimum(self): return minimum(self.t)

We can test minimum and maximum now -

# main.py

from btree import btree

t = btree().add(50).add(60).add(40).add(30).add(45).add(55).add(100)

print(str(t))
# 30->40->45->50->55->60->100

print(t.minimum(), t.maximum())     # <-
# 30 100

insert from iterable

Chaining .add().add().add() is a bit verbose. Providing an add_iter function allows us to insert any number of values from another iterable. We introduce it now because we're about to reuse it in the upcoming remove function too -

def add_iter(t, it):
  for q in it:
    t = add(t, q)
  return t
#main.py

from btree import btree

t = btree().add_iter([50, 60, 40, 30, 45, 55, 100])   # <-

print(str(t))
# 30->40->45->50->55->60->100

print(t.minimum(), t.maximum())
# 30 100

node removal and preorder traversal

We now move onto the remove function. It works similarly to the add function, performing a t.data comparison with the value to remove, q. You'll notice we use add_iter here to combine the left and right branches of the node to be deleted. We could reuse inorder iterator for our tree here, but preorder will keep the tree a bit more balanced. That's a different topic entirely, so we won't get into that now -

# btree.py (continued)

def remove(t, q):
  if not t:
    return t
  elif q < t.data:
    return node(t.data, remove(t.left, q), t.right)
  elif q > t.data:
    return node(t.data, t.left, remove(t.right, q))
  else:
    return add_iter(t.left, preorder(t.right))

def preorder(t):
  if not t: return
  yield t.data
  yield from preorder(t.left)
  yield from preorder(t.right)

Don't forget to extend the btree interface -

# btree.py (continued)

class btree:
  def __init__():       # ...
  def __str__():        # ...
  def add():            # ...
  def inorder():        # ...
  def maximum():        # ...
  def minimum():        # ...
  def add_iter(self, it): return btree(add_iter(self.t, it))
  def remove(self, q): return btree(remove(self.t, q))
  def preorder(self): return preorder(self.t)

Let's see remove in action now -

# main.py

from btree import btree

t = btree().add_iter([50, 60, 40, 30, 45, 55, 100])

print(str(t))
# 30->40->45->50->55->60->100

print(t.minimum(), t.maximum())
# 30 100

t = t.remove(30).remove(50).remove(100)      # <-

print(str(t))
# 40->45->55->60

print(t.minimum(), t.maximum())
# 40 60

completed btree module

Here's the completed module we built over the course of this answer -

from math import inf

class node:
  def __init__(self, data, left=None, right=None):
    self.data = data
    self.left = left
    self.right = right

def add(t, q):
  if not t:
    return node(q)
  elif q < t.data:
    return node(t.data, add(t.left, q), t.right)
  elif q > t.data:
    return node(t.data, t.left, add(t.right, q))
  else:
    return node(q, t.left, t.right)

def add_iter(t, it):
  for q in it:
    t = add(t, q)
  return t

def maximum(t, r=-inf):
  if not t:
    return r
  elif t.data > r:
    return max(maximum(t.left, t.data), maximum(t.right, t.data))
  else:
    return max(maximum(t.left, r), maximum(t.right, r))

def minimum(t, r=inf):
  if not t:
    return r
  elif t.data < r:
    return min(minimum(t.left, t.data), minimum(t.right, t.data))
  else:
    return min(minimum(t.left, r), minimum(t.right, r))

def inorder(t):
  if not t: return
  yield from inorder(t.left)
  yield t.data
  yield from inorder(t.right)

def preorder(t):
  if not t: return
  yield t.data
  yield from preorder(t.left)
  yield from preorder(t.right)

def remove(t, q):
  if not t:
    return t
  elif q < t.data:
    return node(t.data, remove(t.left, q), t.right)
  elif q > t.data:
    return node(t.data, t.left, remove(t.right, q))
  else:
    return add_iter(t.left, preorder(t.right))

def to_str(t):
  return "->".join(map(str,inorder(t)))

class btree:
  def __init__(self, t=None): self.t = t
  def __str__(self): return to_str(self.t)
  def add(self, q): return btree(add(self.t, q))
  def add_iter(self, it): return btree(add_iter(self.t, it))
  def maximum(self): return maximum(self.t)
  def minimum(self): return minimum(self.t)
  def inorder(self): return inorder(self.t)
  def preorder(self): return preorder(self.t)
  def remove(self, q): return btree(remove(self.t, q))

have your cake and eat it too

One understated advantage of the approach above is that we have a dual interface for our btree module. We can use it in the traditional object-oriented way as demonstrated, or we can use it using a more functional approach -

# main.py

from btree import add_iter, remove, to_str, minimum, maximum

t = add_iter(None, [50, 60, 40, 30, 45, 55, 100])

print(to_str(t))
# 30->40->45->50->55->60->100

print(minimum(t), maximum(t))
# 30 100

t = remove(remove(remove(t, 30), 50), 100)

print(to_str(t))
# 40->45->55->60

print(minimum(t), maximum(t))
# 40 60

additional reading

I've written extensively about the techniques used in this answer. Follow the links to see them used in other contexts with additional explanation provided -

Sign up to request clarification or add additional context in comments.

13 Comments

Thanks for the answer. I am question though, how is yield responsible for deleting a node from the tree. I am referring to the last else clause of the remove function, where preorder is being called. My initial hunch, is that the data stored in the memory is being returned and being deleted at the same time - due to yield. Please do correct me.
@motiur, the last clause of the remove function will delete the node, t, where t.data is equal to q, but the branches of that node, t.left and t.right must be preserved. To do this, add_iter(t.left, preorder(t.right)) is effectively merge and creates a new tree. preorder(t.right) is used to insert values into t.left one at a time. This is similar to how we create a tree in the example, t = add_iter(None, [50, 60, 40, 30, 45, 55, 100]) but this inserts values from the array [50, 60, ...] into the empty tree, None. Does that make sense?
So, correct me if I am wrong, so you are merging the left subtree and the right subtree when you called add_iter in the last clause of remove -- but what happened to the node containing the data -- did it get destroyed/deleted, or you just simply ignored it - and did not return it.
this is a key difference with functional techniques: instead of manipulating old (existing) values, a new value is created. In the previous branches, we re-create a new vein down the tree, node(t.data, remove(t.left, q), t.right) or node(t.data, t.left, remove(t.right, q)). This is not a copy of the entire tree, rather only the path to t.data == q is recreated. So when t.data == q, node t is ignored and garbage collected, and t.left and t.right are merged to form the new subtree.
If we write t0 = add_iter(None, [1,2,3]) and then print(t1) I will see 1->2->3. Next we define t1 = remove(t0, 2) and print(t1) we will see 1->3. The call to remove does not change t0, so if we write print(t0) again, we still see 1->2->3. This is what it means to be persistent. If other parts of your program hold a reference to t0, integrity of the tree will always be in tact. Only after all references to a node are cleaned up will the nodes actually be destroyed.
|
1

Instead of deleting the node using del you can reassign the parent node and let the child node be collected by garbage collector.

In deleteNode return the new child instead of deleting the node. Assign the returned value to the parent.

def deleteNode(self, currNode, value):
    if not currNode: 
        return currNode
    elif value < currNode.data:
        return deleteNode(self, currNode.left, value)
    elif value > currNode.data:
        return deleteNode(self, currNode.right, value)

    else: 
        if not currNode.right:
            return currNode.left

        if not currNode.left:
            return currNode.right

        temp_val = currNode.right
        mini_val = temp_val.val
        while temp_val.left:
            temp_val = temp_val.left
            mini_val = temp_val.val
        currNode.right = deleteNode(currNode.right,currNode.val)
    return currNode

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.