12-3 Average node depth in a randomly built binary search tree

In this problem, we prove that the average depth of a node in a randomly built binary search tree with nn nodes is O(lgโกn)O(\lg n). Although this result is weaker than that of Theorem 12.4, the technique we shall use reveals a surprising similarity between the building of a binary search tree and the execution of RANDOMIZED-QUICKSORT\text{RANDOMIZED-QUICKSORT} from Section 7.3.

We define the total path length P(T)P(T) of a binary tree TT as the sum, over all nodes xx in TT, of the depth of node xx, which we denote by d(x,T)d(x, T).

a. Argue that the average depth of a node in TT is

1nโˆ‘xโˆˆTd(x,T)=1nP(T).\frac{1}{n} \sum_{x \in T} d(x, T) = \frac{1}{n} P(T).

Thus, we wish to show that the expected value of P(T)P(T) is O(nlgโกn)O(n\lg n).

b. Let TLT_L and TRT_R denote the left and right subtrees of tree TT, respectively. Argue that if TT has nn nodes, then

P(T)=P(TL)+P(TR)+nโˆ’1.P(T) = P(T_L) + P(T_R) + n - 1.

c. Let P(n)P(n) denote the average total path length of a randomly built binary search tree with n nodes. Show that

P(n)=1nโˆ‘i=0nโˆ’1(P(i)+P(nโˆ’iโˆ’1)+nโˆ’1).P(n) = \frac{1}{n} \sum_{i = 0}^{n - 1} (P(i) + P(n - i - 1) + n - 1).

d. Show how to rewrite P(n)P(n) as

P(n)=2nโˆ‘k=1nโˆ’1P(k)+ฮ˜(n).P(n) = \frac{2}{n} \sum_{k = 1}^{n - 1} P(k) + \Theta(n).

e. Recalling the alternative analysis of the randomized version of quicksort given in Problem 7-3, conclude that P(n)=O(nlgโกn)P(n) = O(n\lg n). At each recursive invocation of quicksort, we choose a random pivot element to partition the set of elements being sorted. Each node of a binary search tree partitions the set of elements that fall into the subtree rooted at that node.

f. Describe an implementation of quicksort in which the comparisons to sort a set of elements are exactly the same as the comparisons to insert the elements into a binary search tree. (The order in which comparisons are made may differ, but the same comparisons must occur.)

(Removed)