5

I have the following code to find the diameter of a tree:

ALGORITHM: TREE_DIAMETER (T)

maxlength ← 0
for s ← 0 to s < |V[T]|
      do temp ← BSF(T, S)
            if maxlength < temp
                   maxlength ← temp
                   increment s by 1
return maxlength

But this algorithm does not run in linear time. Preserving the details of the above code as much as possible (Eg: using BSF), is it possible to optimize it to linear time? You can use pseudo-code.

2
  • What does BFS(T, S) exactly return? What is S(the second argument of BFS)? Commented Sep 3, 2014 at 17:19
  • the title is really confusing: there's no recursion at all but you want a linear algorithm. Commented Sep 3, 2014 at 23:32

2 Answers 2

12

Here is a simple algorithm with linear time complexity:
1)Pick an arbitrary vertex v.
2)Find the furthest vertex from v using BFS(let's call it u).
3)Find the furthest vertex from u using BFS one more time(let's call it t).
Then distance(u, t) is the diameter.
BFS is called only twice, so the time complexity is linear.

Sign up to request clarification or add additional context in comments.

2 Comments

It's definitely linear time. Now prove it's correct.
Proof of the above algorithm(exercise 9-1) courses.csail.mit.edu/6.046/fall01/handouts/ps9sol.pdf
12

As supplementary to the other answer, a proof can be done as follow:

Let's first denote the two ends of the diameter in the tree as s and t, and the function d(u, v) denotes the distance between node u and node v.

Now we choose an arbitrary node X, then we have 2 cases:

  1. X is on the diameter;
  2. X is not on the diameter.

For case 1, it's easy to see that by doing (2) (the second step in the algorithm described in @user2040251's answer), we will get either d(X, s) or d(X, t). If we get something else, say d(X, u), then u and s or t can form a longer path than d(s, t), which contradicts the fact. Therefore, by doing (3), we will get d(s, t).

For case 2, by doing (2), we have 2 cases:

  1. The longest path starting from X contains at least 1 point on the diameter;
  2. The longest path starting from X doesn't share any points with the diameter.

For case 2.1, let's denote the first intersection as Y. Because of case 1, we know the longest path starting from Y must end at either s or t. Therefore, in this case, the longest path starting from X must end at either s or t. Consequently, by doing (3), we will get d(s, t).

For case 2.2, let's denote the other end of the longest path starting from X as Z. Since neither X or Z is on the diameter, and given X, Z, s, t are on the same tree, we can conclude that there must be a node V on the path X to Z, links to a node W on the path s to t. Because X to Z is the longest path starting from X, so d(X, V) + d(V, W) + d(W, t) < d(X, Z). Similarly, we have d(Z, V) + d(V, W) + d(W, s) < d(X, Z). Adding them up and simplify will give us: d(X, Z) > 2d(V, W) + d(s, t) > d(s, t), which contradicts with the fact that s to t is the diameter.

A graph that illustrates case 2.2:

s             Z
|             |
|             |
|             |
W-------------V
|             |
|             |
|             |
t             X

So we have:

d(X, V) + d(V, W) + d(W, t) < d(X, Z) because X to Z is the longest path starting from X;

d(Z, V) + d(V, W) + d(W, s) < d(X, Z) because X to Z is the longest path starting from Z;

Adding up 2 expressions:

d(X, Z) + 2d(V, W) + d(s, t) < 2d(X, Z)

Finally, we have d(X, Z) > 2d(V, W) + d(s, t) > d(s, t), which contradicts the fact.

2 Comments

The reason for d(Z, V) + d(V, W) + d(W, s) < d(X, Z) should be X to Z is the longest path starting from Z, NOT X.
@YuxuanChen yup you are right :) it's a typo, and has been fixed. Thanks!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.