Skip to content

21.4 Analysis of union by rank with path compression

21.4-1

Prove Lemma 21.4.

The lemma states:

For all nodes xx, we have x.rankx.p.rankx.rank \le x.p.rank, with strict inequality if xx.px \ne x.p. The value of x.rankx.rank is initially 00 and increases through time until xx.px \ne x.p; from then on, x.rankx.rank does not change. The value of x.p.rankx.p.rank monotonically increases over time.

The initial value of x.rankx.rank is 00, as it is initialized in line 2 of the MAKE-SET(x)\text{MAKE-SET}(x) procedure. When we run LINK(x,y)\text{LINK}(x, y), whichever one has the larger rank is placed as the parent of the other, and if there is a tie, the parent's rank is incremented. This means that after any LINK(y,x)\text{LINK}(y, x), the two nodes being linked satisfy this strict inequality of ranks.

Also, if we have that xx.px \ne x.p, then, we have that xx is not its own set representative, so, any linking together of sets that would occur would not involve xx, but that's the only way for ranks to increase, so, we have that x.rankx.rank must remain constant after that point.

21.4-2

Prove that every node has rank at most lgn\lfloor \lg n \rfloor.

We'll prove the claim by strong induction on the number of nodes. If n=1n = 1, then that node has rank equal to 0=lg10 = \lfloor \lg 1 \rfloor. Now suppose that the claim holds for 1,2,,n1, 2, \ldots, n nodes.

Given n+1n + 1 nodes, suppose we perform a UNION\text{UNION} operation on two disjoint sets with aa and bb nodes respectively, where a,bna, b \le n. Then the root of the first set has rank at most lga\lfloor \lg a \rfloor and the root of the second set has rank at most lgb\lfloor \lg b\rfloor.

If the ranks are unequal, then the UNION\text{UNION} operation preserves rank and we are done, so suppose the ranks are equal. Then the rank of the union increases by 11, and the resulting set has rank lga+1lg(n+1)/2+1=lg(n+1)\lfloor\lg a\rfloor + 1 \le\lfloor\lg(n + 1) / 2\rfloor + 1 = \lfloor\lg(n + 1)\rfloor.

21.4-3

In light of Exercise 21.4-2, how many bits are necessary to store x.rankx.rank for each node xx?

Since their value is at most lgn\lfloor \lg n \rfloor, we can represent them using Θ(lg(lg(n)))\Theta(\lg(\lg(n))) bits, and may need to use that many bits to represent a number that can take that many values.

21.4-4

Using Exercise 21.4-2, give a simple proof that operations on a disjoint-set forest with union by rank but without path compression run in O(mlgn)O(m\lg n) time.

MAKE-SET\text{MAKE-SET} takes constant time and both FIND-SET\text{FIND-SET} and UNION\text{UNION} are bounded by the largest rank among all the sets. Exercise 21.4-2 bounds this from about by lgn\lceil \lg n \rceil, so the actual cost of each operation is O(lgn)O(\lg n). Therefore the actual cost of mm operations is O(mlgn)O(m\lg n).

21.4-5

Professor Dante reasons that because node ranks increase strictly along a simple path to the root, node levels must monotonically increase along the path. In other words, if x.rank>0x.rank > 0 and x.px.p is not a root, then level(x)level(x.p)\text{level}(x) \le \text{level}(x.p). Is the professor correct?

Professor Dante is not correct.

Suppose that we had that x.p.rank>A2(x.rank)x.p.rank > A_2(x.rank) but that x.p.p.rank=1+x.p.rankx.p.p.rank = 1 + x.p.rank, then we would have that level(x.p)=0\text{level}(x.p) = 0, but level(x)2\text{level}(x) \ge 2. So, we don't have that level(x)level(x.p)\text{level}(x) \le \text{level}(x.p) even though we have that the ranks are monotonically increasing as we go up in the tree. Put another way, even though the ranks are monotonically increasing, the rate at which they are increasing (roughly captured by the level values) doesn't have to be increasing.

21.4-6 \star

Consider the function α(n)=min{k:Ak(1)lg(n+1)}\alpha'(n) = \min \{k: A_k(1) \ge \lg(n + 1)\}. Show that α(n)3\alpha'(n) \le 3 for all practical values of nn and, using Exercise 21.4-2, show how to modify the potential-function argument to prove that we can perform a sequence of mm MAKE-SET\text{MAKE-SET}, UNION\text{UNION}, and FIND-SET\text{FIND-SET} operations, nn of which are MAKE-SET\text{MAKE-SET} operations, on a disjoint-set forest with union by rank and path compression in worst-case time O(mα(n))O(m \alpha'(n)).

First, observe that by a change of variables, α(2n1)=α(n)\alpha'(2^{n − 1}) = \alpha(n). Earlier in the section we saw that α(n)3\alpha(n) \le 3 for 0n20470 \le n \le 2047. This means that α(n)2\alpha'(n) \le 2 for 0n220460 \le n \le 2^{2046}, which is larger than the estimated number of atoms in the observable universe.

To prove the improved bound O(mα(n))O(m\alpha'(n)) on the operations, the general structure will be essentially the same as that given in the section.

First, modify bound 21.2 by observing that Aα(n)(x.rank)Aα(n)(1)lg(n+1)>x.p.rankA_{\alpha'(n)}(x.rank) \ge A_{\alpha'(n)}(1) \ge \lg(n + 1) > x.p.rank which implies level(x)α(n)\text{level}(x) \le \alpha'(n).

Next, redefine the potential replacing α(n)\alpha(n) by α(n)\alpha'(n). Lemma 21.8 now goes through just as before. All subsequent lemmas rely on these previous observations, and their proofs go through exactly as in the section, yielding the bound.