21-3 Tarjan's off-line least-common-ancestors algorithm

The least common ancestor of two nodes uu and vv in a rooted tree TT is the node ww that is an ancestor of both uu and vv and that has the greatest depth in TT. In the off-line least-common-ancestors problem, we are given a rooted tree TT and an arbitrary set P={{u,v}}P = \{\{u, v\}\} of unordered pairs of nodes in TT, and we wish to determine the least common ancestor of each pair in PP.

To solve the off-line least-common-ancestors problem, the following procedure performs a tree walk of TT with the initial call LCA(T.root)\text{LCA}(T.root). We assume that each node is colored WHITE\text{WHITE} prior to the walk.

cpp LCA(u) MAKE-SET(u) FIND-SET(u).ancestor = u for each child v of u in T LCA(v) UNION(u, v) FIND-SET(u).ancestor = u u.color = BLACK for each node v such that {u, v} ∈ P if v.color == BLACK print "The least common ancestor of" u "and" v "is" FIND-SET(v).ancestor

a. Argue that line 10 executes exactly once for each pair {u,v}P\{u, v\} \in P.

b. Argue that at the time of the call LCA(u)\text{LCA}(u), the number of sets in the disjoint-set data structure equals the depth of uu in TT.

c. Prove that LCA\text{LCA} correctly prints the least common ancestor of uu and vv for each pair {u,v}P\{u, v\} \in P.

d. Analyze the running time of LCA\text{LCA}, assuming that we use the implementation of the disjoint-set data structure in Section 21.3.

a. Suppose that we let LCA\le_{LCA} to be an ordering on the vertices so that uLCAvu \le_{LCA} v if we run line 7 of LCA(u)\text{LCA}(u) before line 7 of LCA(v)\text{LCA}(v). Then, when we are running line 7 of LCA(u)\text{LCA}(u), we immediately go on to the for loop on line 8.

So, while we are doing this for loop, we still haven't called line 7 of LCA(v)\text{LCA}(v). This means that v.colorv.color is white, and so, the pair {u,v}\{u, v\} is not considered during the run of LCA(u)\text{LCA}(u). However, during the for loop of LCA(v)\text{LCA}(v), since line 7 of LCA(u)\text{LCA}(u) has already run, u.color=blacku.color = black. This means that we will consider the pair {v,u}\{v, u\} during the running of LCA(v)\text{LCA}(v).

It is not obvious what the ordering LCA\le_{LCA} is, as it will be implementation dependent. It depends on the order in which child vertices are iterated in the for loop on line 3. That is, it doesn't just depend on the graph structure.

b. We suppose that it is true prior to a given call of LCA\text{LCA}, and show that this property is preserved throughout a run of the procedure, increasing the number of disjoint sets by one by the end of the procedure. So, supposing that uu has depth dd and there are dd items in the disjoint set data structure before it runs, it increases to d+1d + 1 disjoint sets on line 1. So, by the time we get to line 4, and call LCA\text{LCA} of a child of uu, there are d+1d + 1 disjoint sets, this is exactly the depth of the child. After line 4, there are now d+2d + 2 disjoint sets, so, line 5 brings it back down to d+1d + 1 disjoint sets for the subsequent times through the loop. After the loop, there are no more changes to the number of disjoint sets, so, the algorithm terminates with d + 1\text{d + 1} disjoint sets, as desired. Since this holds for any arbitrary run of LCA\text{LCA}, it holds for all runs of LCA\text{LCA}.

c. Suppose that the pair uu and vv have the least common ancestor ww. Then, when running LCA(w)\text{LCA}(w), uu will be in the subtree rooted at one of ww's children, and vv will be in another. WLOG, suppose that the subtree containing uu runs first.

So, when we are done with running that subtree, all of their ancestor values will point to ww and their colors will be black, and their ancestor values will not change until LCA(w)\text{LCA}(w) returns. However, we run LCA(v)\text{LCA}(v) before LCA(w)\text{LCA}(w) returns, so in the for loop on line 8 of LCA(v)\text{LCA}(v), we will be considering the pair {u,v}\{u, v\}, since u.color=blacku.color = black. Since u.ancestoru.ancestor is still ww, that is what will be output, which is the correct answer for their LCA\text{LCA}.

d. The time complexity of lines 1 and 2 are just constant. Then, for each child, we have a call to the same procedure, a UNION\text{UNION} operation which only takes constant time, and a FIND-SET\text{FIND-SET} operation which can take at most amortized inverse Ackerman's time. Since we check each and every thing that is adjacent to uu for being black, we are only checking each pair in PP at most twice in lines 8-10, among all the runs of LCA\text{LCA}. This means that the total runtime is O(Tα(T)+P)O(|T|\alpha(|T|) + |P|).