Determine if two binary trees are equal

Question

What would be the efficient algorithm to find if two given binary trees are equal - in structure and content?

Two nodes are equal if their data contents are the same and their left children are equal and their right children are equal. That's all you really need to know. — Hot Licks
– Hot Licks, Commented Oct 12, 2011 at 1:39

score 24 · Accepted Answer · 2012-10-16 19:03:22Z

24

It's a minor issue, but I'd adapt the earlier solution as follows...

eq(t1, t2) =
  t1.data=t2.data && eq(t1.left, t2.left) && eq(t1.right, t2.right)

The reason is that mismatches are likely to be common, and it is better to detect (and stop comparing) early - before recursing further. Of course, I'm assuming a short-circuit && operator here.

I'll also point out that this is glossing over some issues with handling structurally different trees correctly, and with ending the recursion. Basically, there need to be some null checks for t1.left etc. If one tree has a null .left but the other doesn't, you have found a structural difference. If both have null .left, there's no difference, but you have reached a leaf - don't recurse further. Only if both .left values are non-null do you recurse to check the subtree. The same applies, of course, for .right.

You could include checks for e.g. (t1.left == t2.left), but this only makes sense if subtrees can be physically shared (same data structure nodes) for the two trees. This check would be another way to avoid recursing where it is unnecessary - if t1.left and t2.left are the same physical node, you already know that those whole subtrees are identical.

A C implementation might be...

bool tree_compare (const node* t1, const node* t2)
{
  // Same node check - also handles both NULL case
  if (t1 == t2)  return true;

  // Gone past leaf on one side check
  if ((t1 == NULL) || (t2 == NULL))  return false;

  // Do data checks and recursion of tree
  return ((t1->data == t2->data) && tree_compare (t1->left,  t2->left )
                                 && tree_compare (t1->right, t2->right));
}

EDIT In response to a comment...

The running time for a full tree comparison using this is most simply stated as O(n) where n is kinda the size of a tree. If you're willing to accept a more complex bound you can get a smaller one such as O(minimum(n1, n2)) where n1 and n2 are the sizes of the trees.

The explanation is basically that the recursive call is only made (at most) once for each node in the left tree, and only made (at most) once for each node in the right tree. As the function itself (excluding recursions) only specifies at most a constant amount of work (there are no loops), the work including all recursive calls can only be as much as the size of the smaller tree times that constant.

You could analyse further to get a more complex but smaller bound using the idea of the intersection of the trees, but big O just gives an upper bound - not necessarily the lowest possible upper bound. It's probably not worthwhile doing that analysis unless you're trying to build a bigger algorithm/data structure with this as a component, and as a result you know that some property will always apply to those trees which may allow you a tighter bound for the larger algorithm.

One way to form a tigher bound is to consider the sets of paths to nodes in both trees. Each step is either an L (left subtree) or an R (right subtree). So the root is specified with an empty path. The right child of the left child of the root is "LR". Define a function "paths (T)" (mathematically - not part of the program) to represent the set of valid paths into a tree - one path for every node.

So we might have...

paths(t1) = { "", "L", "LR", "R", "RL" }
paths(t2) = { "", "L", "LL", "R", "RR" }

The same path specifications apply to both trees. And each recursion always follows the same left/right link for both trees. So the recursion visits the paths in the itersection of these sets, and the tightest bound we can specify using this is the cardinality of that intersection (still with the constant bound on work per recursive call).

For the tree structures above, we do recursions for the following paths...

paths(t1) intersection paths(t2) = { "", "L", "R" }

So our work in this case is bounded to at most three times the maximum cost of non-recursive work in the tree_compare function.

This is normally an unnecessary amount of detail, but clearly the intersection of the path-sets is at most as large as the number of nodes in the smallest original tree. And whether the n in O(n) refers to the number of nodes in one original tree or to the sum of the nodes in both, this is clearly no smaller than either the minimum or our intersection. Therefore O(n) isn't such a tight bound, but it's still a valid upper bound, even if we're a bit vague which size we're talking about.

edited Oct 16, 2012 at 19:03

answered Sep 28, 2009 at 13:46

user180247

Sign up to request clarification or add additional context in comments.

8 Comments

Rachel Over a year ago

@Steve: Thank you Steve for precise explanation.

Himanshu Jindal Over a year ago

Is the running time of this algorithm O(n^2)?

Sumit Gera Over a year ago

@Steve314 when the first nodes of the trees are same, the algorithm returns true and the comparison stops ?(please correct me if i am wrong).

user180247 Over a year ago

@buzzinga - if the root nodes are both the same node, referenced through the same pointer value, then the algorithm correctly returns true and comparison correctly stops. If the root nodes are both the same node, then the trees are both the same tree. The values within the trees must be the same because they're the same tree.

Sumit Gera Over a year ago

@Steve314 now since the comparison stops, how will the rest of the nodes be compared? and if not compared, how will it be proved that two trees are isomorphic or not?

|

Brian · Accepted Answer · 2009-09-27 05:23:57Z

4

Modulo stack overflow, something like

eq(t1, t2) =
    eq(t1.left, t2.left) && t1.data=t2.data && eq(t1.right, t2.right)

(This generalizes to an equality predicate for all tree-structured algebraic data types - for any piece of structured data, check if each of its sub-parts are equal to each of the other one's sub-parts.)

answered Sep 27, 2009 at 5:23

Brian

119k18 gold badges246 silver badges304 bronze badges

1 Comment

Rachel Over a year ago

Thank you Brian for giving apt explanation.

Or Svantovit · Accepted Answer · 2016-01-05 08:09:57Z

3

We can also do any of the two traversals (pre-order, post-order or in-order) and then compare the results of both the trees. If they are same, we can be sure of their equivalence.

edited Jan 5, 2016 at 8:09

Or Svantovit

2551 gold badge6 silver badges18 bronze badges

answered Aug 31, 2012 at 5:45

Pranav

311 bronze badge

1 Comment

Shrikant Prabhu Over a year ago

It will also work but it's more costly , because you will have to travel each tree twice i.e. O(4n) + O(2n) to go over their traversals. Also if trees are big enough but have different sets of nodes in early levels, you will still have to traverse them till the end instead of breaking out.

Fragsworth · Accepted Answer · 2009-09-27 05:07:41Z

1

A more general term for what you are probably trying to accomplish is graph isomorphism. There are some algorithms to do this on that page.

answered Sep 27, 2009 at 5:07

Fragsworth

36k27 gold badges86 silver badges99 bronze badges

1 Comment

ShreevatsaR Over a year ago

FWIW, tree isomorphism can be checked in linear time.

hipsandy · Accepted Answer · 2013-09-17 22:48:26Z

1

Since it's a proven fact that - it is possible to recreate a binary tree as long as we have the following:

The sequence of nodes that are encountered in an In-Order Traversal.
The sequence of nodes that are encountered in a Pre-Order OR Post-Order Traversal

If two binary trees have the same in-order and [pre-order OR post-order] sequence, then they should be equal both structurally and in terms of values.

Each traversal is an O(n) operation. The traversals are done 4 times in total and the results from the same-type of traversal is compared. O(n) * 4 + 2 => O(n) Hence, the total order of time-complexity would be O(n)

answered Sep 17, 2013 at 22:48

hipsandy

1,0029 silver badges7 bronze badges

Comments

Community · Accepted Answer · 2020-06-20 09:12:55Z

0

I would write it as follows. The following code will work in most functional language, and even in python if your datatypes are hashable (e.g. not dictionaries or lists):

topological equality (same in structure, i.e. Tree(1,Tree(2,3))==Tree(Tree(2,3),1)):

tree1==tree2 means set(tree1.children)==set(tree2.children)
ordered equality:

tree1==tree2 means tree1.children==tree2.children

(Tree.children is an ordered list of children)

You don't need to handle the base cases (leaves), because equality has been defined for them already.

edited Jun 20, 2020 at 9:12

CommunityBot

11 silver badge

answered Oct 12, 2011 at 1:32

ninjagecko

91.5k24 gold badges144 silver badges153 bronze badges

3 Comments

jfs Over a year ago

Could you elaborate how set(tree1.children)==set(tree2.children) would work on Tree(0,Tree(1,Tree(2,3)))==Tree(0,Tree(Tree(1,3),2))?

ninjagecko Over a year ago

@J.F.Sebastian: it would work correctly (i.e. return the result "not equal") as follows: Is 0==0 and (1,(2,3))==((1,3),2)? Let's first check if (1,(2,3))==((1,3),2). Is 1==(1,3) and (2,3)==2? No, because a tree is not equal to an atom. Therefore it is not true that (1,(2,3))==((1,3),2). Therefore it is not true that (0,(1,(2,3)))==(0,((1,3),2)). The idea is that set equality is recursively defined on the equality of its elements, and therefore an equality check will innately be recursive.

jfs Over a year ago

why do you skip the 1==2 and (2,3)==(1,3) check? Does your set() definition implies a certain order for elements? Otherwise it is very expensive as an implementation. It might be ok as a mere definition (not implying any implementation) though the word topological implies for me that nodes values are not important and only the shape of the trees matters e.g., to a topologist, a coffee cup and a donut are the same thing

Savannah Madison · Accepted Answer · 2020-12-01 06:19:58Z

0

bool identical(node* root1,node* root2){
    if(root1 == NULL && root2 == NULL)
        return true;
    if(root1==NULL && root2!=NULL || root1!=NULL && root2 == NULL)
        return false;
    if(root1->data == root2->data){
        bool lIdetical = identical(root1->left,root2->left);
        if(!lIdentical)
            return false;
        bool rIdentical = identical(root1->right,root2->identical);
        return lIdentical && rIdentical;
    }
    else{
        printf("data1:%d vs data2:%d",root1->data,root2->data);
        return false;
    }
}

I do not know if this is the most effecient but I think this works.

answered Dec 1, 2020 at 6:19

Savannah Madison

6773 gold badges10 silver badges22 bronze badges

Collectives™ on Stack Overflow

Determine if two binary trees are equal

7 Answers 7

8 Comments

1 Comment

1 Comment

1 Comment

Comments

3 Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

7 Answers 7

8 Comments

1 Comment

1 Comment

1 Comment

Comments

3 Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related