I have the defined Type: data Tree = Node Tree Tree | Leaf Int | NIL. I want create a method delete :: Tree -> Int -> Tree which removes all Leaf's with the specific Int given in the second parameter.
2 Answers
If your tree doesn't have any particular structure, you can do
delete NIL _ = NIL
delete (Leaf i) int | i == int = NIL
| otherwise = Leaf i
delete (Node left right) int = Node (delete left int) (delete right int)
Why?
delete NIL _ = NIL because we have to deal with all cases, even the empty trees at the ends. The _ stands for any value that we don't care about.
delete (Leaf i) int | i == int = NIL
| otherwise = Leaf i
because we need to first check | i== int to see whether we want to delete the node. If we do, we replace it with the empty three, NIL. Otherwise, we leave it alone.
delete (Node left right) int = Node (delete left int) (delete right int) because if we're at a node, we need to delete the int from both left and right subtrees.
Aren't you going to end up with a whole load of NILs?
Yes, I suppose that could happen. You could clear up with
prune (Node NIL NIL ) = NIL
prune (Node (Leaf i) NIL ) = Leaf i
prune (Node NIL (Leaf i)) = Leaf i
prune (Node (Leaf i) (Leaf j)) = Node (Leaf i) (Leaf j)
prune (Node left right ) = prune (Node (prune left) (prune right))
prune t = t
The first three lines get rid of a NIL on the left, right or both, and the fourth leaves two leaves alone.
The fifth line only gets called when one of the left or right subtrees of this node is itself a node. Why prune three times? Maybe when you prune left and prune right one or more of them ends up NIL.
The prune t = t deals with both NIL and Leaf i in one neat pattern match.
I would suggest some improvements to AndrewC's answer. While his solution is absolutely correct, it has some potential performance issue.
The issue is: both delete and prune functions create a new copy of the whole tree on every call. That happens regardless of whether an element was actually deleted.
The worst case scenario is deleting a non-existing element.
Let's say we have a really big tree that holds 1M of integers. Since integers are stored in leaves only, the whole tree contains at least 2M-1 of nodes. (Or even more if the tree was not pruned yet thus contains NIL nodes).
When we try to delete a non-existing element from such a huge tree, our delete function will do absolutely nothing but duplicating all 2M of nodes. (And prune will duplicate them again!)
Deleting an existing element is just a tiny bit better. At this case, delete removes one leaf, updates it's parent and duplicates the rest of the tree. prune will probably remove a few more nodes but will duplicate the rest.
Why does that happen?
There are two places where duplication happens.
This line creates a new tree that is completely identical to the argument:
delete (Leaf i) int | ...
| otherwise = Leaf i
Also, this line creates a new tree even if the element is not present in both left and right branches:
delete (Node left right) int = Node (delete left int) (delete right int)
Is it possible to avoid unnecessary duplication?
Yes, of course. We just need to return the argument if we do not modify it.
Here is my version:
delete t i = fst $ delete' t i
where delete' NIL _ = (NIL, True)
delete' t@(Leaf i) int | i == int = (NIL, False)
| otherwise = (t, True)
delete' t@(Node left right) int =
let (left', unchangedLeft) = delete' left int
(right', unchangedRight) = delete' right int
in
if unchangedLeft && unchangedRight
then (t, True)
else (Node left' right', False)
As you can see, I use a helper function delete' that returns a pair of (Tree, Bool) where second element is True if the tree was not changed, and False otherwise.
This function builds a new tree that shares most of the nodes with the original one. It only changes nodes on the path from root to the deleted element.
What about prune ?
The version above does not delete NIL elements. As AndrewC noted, after performing multiple deletes we may have a tree with a lot of NILs. To address this issue, we can either modify prune in the similar way, or we can merely integrate it into the delete:
delete t i = fst $ delete'' t i
where delete'' NIL _ = (NIL, True)
delete'' t@(Leaf i) int | i == int = (NIL, False)
| otherwise = (t, True)
delete'' t@(Node left right) int =
let (left', unchangedLeft) = delete'' left int
(right', unchangedRight) = delete'' right int
in
if unchangedLeft && unchangedRight
then (t, True)
else (newNode left' right', False)
newNode NIL r = r
newNode l NIL = l
newNode l r = Node l r
delete (Leaf n) n' =.