21-1 Off-line minimum

The off-line minimum problem asks us to maintain a dynamic set TT of elements from the domain {1,2,,n}\{1, 2, \ldots, n\} under the operations INSERT\text{INSERT} and EXTRACT-MIN\text{EXTRACT-MIN}. We are given a sequence SS of nn INSERT\text{INSERT} and mm EXTRACT-MIN\text{EXTRACT-MIN} calls, where each key in {1,2,,n}\{1, 2, \ldots, n\} is inserted exactly once. We wish to determine which key is returned by each EXTRACT-MIN\text{EXTRACT-MIN} call. Specifically, we wish to fill in an array extracted[1..m]extracted[1..m], where for i=1,2,,mi = 1, 2, \ldots, m, extracted[i]extracted[i] is the key returned by the iith EXTRACT-MIN\text{EXTRACT-MIN} call. The problem is "off-line" in the sense that we are allowed to process the entire sequence SS before determining any of the returned keys.

a. In the following instance of the off-line minimum problem, each operation INSERT(i)\text{INSERT}(i) is represented by the value of ii and each EXTRACT-MIN\text{EXTRACT-MIN} is represented by the letter E\text E:

4,8,E,3,E,9,2,6,E,E,E,1,7,E,5.4, 8, \text E, 3, \text E, 9, 2, 6, \text E, \text E, \text E, 1, 7, \text E, 5.

Fill in the correct values in the extracted array.

To develop an algorithm for this problem, we break the sequence SS into homogeneous subsequences. That is, we represent SS by

I1,E,I2,E,I3,,Im,E,Im+1,\text I_1, \text E, \text I_2, \text E, \text I_3, \ldots, \text I_m,\text E, \text I_{m + 1},

where each E\text E represents a single EXTRACT-MIN\text{EXTRACT-MIN} call and each Ij\text{I}_j represents a (possibly empty) sequence of INSERT\text{INSERT} calls. For each subsequence Ij\text{I}_j , we initially place the keys inserted by these operations into a set KjK_j, which is empty if Ij\text{I}_j is empty. We then do the following:

cpp OFF-LINE-MINIMUM(m, n) for i = 1 to n determine j such that i ∈ K[j] if j != m + 1 extracted[j] = i let l be the smallest value greater than j for which set K[l] exists K[l] = K[j] ∪ K[l], destroying K[j] return extracted

b. Argue that the array extracted returned by OFF-LINE-MINIMUM\text{OFF-LINE-MINIMUM} is correct.

c. Describe how to implement OFF-LINE-MINIMUM\text{OFF-LINE-MINIMUM} efficiently with a disjoint-set data structure. Give a tight bound on the worst-case running time of your implementation.

a.

indexvalue142332465861 \begin{array}{|c|c|} \hline index & value \\ \hline 1 & 4 \\ 2 & 3 \\ 3 & 2 \\ 4 & 6 \\ 5 & 8 \\ 6 & 1 \\ \hline \end{array}

b. As we run the for loop, we are picking off the smallest of the possible elements to be removed, knowing for sure that it will be removed by the next unused EXTRACT-MIN\text{EXTRACT-MIN} operation. Then, since that EXTRACT-MIN\text{EXTRACT-MIN} operation is used up, we can pretend that it no longer exists, and combine the set of things that were inserted by that segment with those inserted by the next, since we know that the EXTRACT-MIN\text{EXTRACT-MIN} operation that had separated the two is now used up. Since we proceed to figure out what the various extract operations do one at a time, by the time we are done, we have figured them all out.

c. We let each of the sets be represented by a disjoint set structure. To union them (as on line 6) just call UNION\text{UNION}. Checking that they exist is just a matter of keeping track of a linked list of which ones exist(needed for line 5), initially containing all of them, but then, when deleting the set on line 6, we delete it from the linked list that we were maintaining. The only other interaction with the sets that we have to worry about is on line 2, which just amounts to a call of FIND-SET(j)\text{FIND-SET}(j). Since line 2 takes amortized time α(n)\alpha(n) and we call it exactly nn times, then, since the rest of the for loop only takes constant time, the total runtime is O(nα(n))O(n\alpha(n)).