8-7 The 00-11 sorting lemma and columnsort

A compare-exchange operation on two array elements A[i]A[i] and A[j]A[j], where i<ji < j, has the form

cpp COMPARE-EXCHANGE(A, i, j) if A[i] > A[j] exchange A[i] with A[j]

After the compare-exchange operation, we know that A[i]A[j]A[i] \le A[j].

An oblivious compare-exchange algorithm operates solely by a sequence of prespecified compare-exchange operations. The indices of the positions compared in the sequence must be determined in advance, and although they can depend on the number of elements being sorted, they cannot depend on the values being sorted, nor can they depend on the result of any prior compare-exchange operation. For example, here is insertion sort expressed as an oblivious compare-exchange algorithm:

cpp INSERTION-SORT(A) for j = 2 to A.length for i = j - 1 downto 1 COMPARE-EXCHANGE(A, i, i + 1)

The 0-1 sorting lemma provides a powerful way to prove that an oblivious compare-exchange algorithm produces a sorted result. It states that if an oblivious compare-exchange algorithm correctly sorts all input sequences consisting of only 00s and 11s, then it correctly sorts all inputs containing arbitrary values.

You will prove the 00-11 sorting lemma by proving its contrapositive: if an oblivious compare-exchange algorithm fails to sort an input containing arbitrary values, then it fails to sort some 00-11 input. Assume that an oblivious compare-exchange algorithm X\text X fails to correctly sort the array A[1..n]A[1..n]. Let A[p]A[p] be the smallest value in AA that algorithm X\text X puts into the wrong location, and let A[q]A[q] be the value that algorithm X\text X moves to the location into which A[p]A[p] should have gone. Define an array B[1..n]B[1..n] of 00s and 11s as follows:

B[i]={0if A[i]A[p],1if A[i]>A[p]. B[i] = \begin{cases} 0 & \text{if $A[i] \le A[p]$}, \\ 1 & \text{if $A[i] > A[p]$}. \end{cases}

a. Argue that A[q]>A[p]A[q] > A[p], so that B[p]=0B[p] = 0 and B[q]=1B[q] = 1.

b. To complete the proof of the 00-11 sorting lemma, prove that algorithm X\text X fails to sort array BB correctly.

Now you will use the 00-11 sorting lemma to prove that a particular sorting algorithm works correctly. The algorithm, columnsort, works on a rectangular array of nn elements. The array has rr rows and ss columns (so that n=rsn = rs), subject to three restrictions:

  • rr must be even,
  • ss must be a divisor of rr, and
  • r2s2r \ge 2 s^2.

When columnsort completes, the array is sorted in column-major order: reading down the columns, from left to right, the elements monotonically increase.

Columnsort operates in eight steps, regardless of the value of nn. The odd steps are all the same: sort each column individually. Each even step is a fixed permutation. Here are the steps:

  1. Sort each column.
  2. Transpose the array, but reshape it back to rr rows and ss columns. In other words, turn the leftmost column into the top r/sr / s rows, in order; turn the next column into the next r/sr / s rows, in order; and so on.
  3. Sort each column.
  4. Perform the inverse of the permutation performed in step 2.
  5. Sort each column.
  6. Shift the top half of each column into the bottom half of the same column, and shift the bottom half of each column into the top half of the next column to the right. Leave the top half of the leftmost column empty. Shift the bottom half of the last column into the top half of a new rightmost column, and leave the bottom half of this new column empty.
  7. Sort each column.
  8. Perform the inverse of the permutation performed in step 6.

Figure 8.5 shows an example of the steps of columnsort with r=6r = 6 and s=3s = 3. (Even though this example violates the requirement that r2s2r \ge 2s^2, it happens to work.)

c. Argue that we can treat columnsort as an oblivious compare-exchange algorithm, even if we do not know what sorting method the odd steps use.

Although it might seem hard to believe that columnsort actually sorts, you will use the 00-11 sorting lemma to prove that it does. The 00-11 sorting lemma applies because we can treat columnsort as an oblivious compare-exchange algorithm. A couple of definitions will help you apply the 00-11 sorting lemma. We say that an area of an array is clean if we know that it contains either all 00s or all 11s. Otherwise, the area might contain mixed 00s and 11s, and it is dirty. From here on, assume that the input array contains only 00s and 11s, and that we can treat it as an array with rr rows and ss columns.

d. Prove that after steps 1–3, the array consists of some clean rows of 00s at the top, some clean rows of 11s at the bottom, and at most ss dirty rows between them.

e. Prove that after step 4, the array, read in column - major order, starts with a clean area of 00s, ends with a clean area of 11s, and has a dirty area of at most s2s^2 elements in the middle.

f. Prove that steps 5–8 produce a fully sorted 00-11 output. Conclude that columnsort correctly sorts all inputs containing arbitrary values.

g. Now suppose that ss does not divide rr. Prove that after steps 1–3, the array consists of some clean rows of 00s at the top, some clean rows of 11s at the bottom, and at most 2s12s - 1 dirty rows between them. How large must rr be, compared with ss, for columnsort to correctly sort when ss does not divide rr?

h. Suggest a simple change to step 1 that allows us to maintain the requirement that r2s2r \ge 2s^2 even when ss does not divide rr, and prove that with your change, columnsort correctly sorts.

(Removed)