17-2 Making binary search dynamic

Binary search of a sorted array takes logarithmic search time, but the time to insert a new element is linear in the size of the array. We can improve the time for insertion by keeping several sorted arrays.

Specifically, suppose that we wish to support SEARCH\text{SEARCH} and INSERT\text{INSERT} on a set of nn elements. Let k=โŒˆlgโก(n+1)โŒ‰k = \lceil \lg(n + 1) \rceil, and let the binary representation of nn be โŸจnkโˆ’1,nkโˆ’2,โ€ฆ,n0โŸฉ\langle n_{k - 1}, n_{k - 2}, \ldots, n_0 \rangle. We have kk sorted arrays A0,A1,โ€ฆ,Akโˆ’1A_0, A_1, \ldots, A_{k - 1}, where for i=0,1,โ€ฆ,kโˆ’1i = 0, 1, \ldots, k - 1, the length of array AiA_i is 2i2^i. Each array is either full or empty, depending on whether ni=1n_i = 1 or ni=0n_i = 0, respectively. The total number of elements held in all kk arrays is therefore โˆ‘i=0kโˆ’1ni2i=n\sum_{i = 0}^{k - 1} n_i 2^i = n. Although each individual array is sorted, elements in different arrays bear no particular relationship to each other.

a. Describe how to perform the SEARCH\text{SEARCH} operation for this data structure. Analyze its worst-case running time.

b. Describe how to perform the INSERT\text{INSERT} operation. Analyze its worst-case and amortized running times.

c. Discuss how to implement DELETE\text{DELETE}.

a. We linearly go through the lists and binary search each one since we don't know the relationship between one list an another. In the worst case, every list is actually used. Since list ii has length 2i2^i and it's sorted, we can search it in O(i)O(i) time. Since ii varies from 00 to O(lgโกn)O(\lg n), the runtime of SEARCH is O(lgโก2n)O(\lg^2 n).

b. To insert, we put the new element into A0A_0 and update the lists accordingly. In the worst case, we must combine lists A0,A1,โ€ฆ,Amโˆ’1A_0, A_1, \dots, A_{m โˆ’ 1} into list Am. Since merging two sorted lists can be done linearly in the total length of the lists, the time this takes is O(2m)O(2^m). In the worst case, this takes time O(n)O(n) since mm could equal kk.

We'll use the accounting method to analyse the amoritized cost. Assign a cost of lgโกn\lg n to each insertion. Thus, each item carries lgโกn\lg n credit to pay for its later merges as additional items are inserted. Since an individual item can only be merged into a larger list and there are only lgโกn\lg n lists, the credit pays for all future costs the item might incur. Thus, the amortized cost is O(lgโกn)O(\lg n).

c. Find the smallest mm such that nmโ‰ 0n_m \ne 0 in the binary representation of nn. If the item to be deleted is not in list AmA_m, remove it from its list and swap in an item from Am, arbitrarily. This can be done in O(lgโกn)O(\lg n) time since we may need to search list AkA_k to find the element to be deleted. Now simply break list AmA_m into lists A0,A1,โ€ฆ,Amโˆ’1A_0, A_1, \dots, A_{m โˆ’ 1} by index. Since the lists are already sorted, the runtime comes entirely from making the splits, which takes O(m)O(m) time. In the worst case, this is O(lgโกn)O(\lg n).