5.3 Randomized algorithms
5.3-1
Professor Marceau objects to the loop invariant used in the proof of Lemma 5.5. He questions whether it is true prior to the first iteration. He reasons that we could just as easily declare that an empty subarray contains no $0$-permutations. Therefore, the probability that an empty subarray contains a $0$-permutation should be $0$, thus invalidating the loop invariant prior to the first iteration. Rewrite the procedure $\text{RANDOMIZE-IN-PLACE}$ so that its associated loop invariant applies to a nonempty subarray prior to the first iteration, and modify the proof of Lemma 5.5 for your procedure.
Modify the algorithm by unrolling the $i = 1$ case.
swap(A[1], A[RANDOM(1, n)])
for i = 2 to n
swap(A[i], A[RANDOM(i, n)])
Modify the proof of Lemma 5.5 by starting with $i = 2$ instead of $i = 1$. This resolves the issue of $0$-permutations.
5.3-2
Professor Kelp decides to write a procedure that produces at random any permutation besides the identity permutation. He proposes the following procedure:
cpp PERMUTE-WITHOUT-IDENTITY(A) n = A.length for i = 1 to n - 1 swap A[i] with A[RANDOM(i + 1, n)]
Does this code do what Professor Kelp intends?
The code does not do what he intends. Suppose $A = [1, 2, 3]$. If the algorithm worked as proposed, then with nonzero probability the algorithm should output $[3, 2, 1]$. On the first iteration we swap $A[1]$ with either $A[2]$ or $A[3]$. Since we want $[3, 2, 1]$ and will never again alter $A[1]$, we must necessarily swap with $A[3]$. Now the current array is $[3, 2, 1]$. On the second (and final) iteration, we have no choice but to swap $A[2]$ with $A[3]$, so the resulting array is $[3, 1, 2]$. Thus, the procedure cannot possibly be producing random non-identity permutations.
5.3-3
Suppose that instead of swapping element $A[i]$ with a random element from the subarray $A[i..n]$, we swapped it with a random element from anywhere in the array:
cpp PERMUTE-WITH-ALL(A) n = A.length for i = 1 to n swap A[i] with A[RANDOM(1, n)]
Does this code produce a uniform random permutation? Why or why not?
Consider the case of $n = 3$ in running the algorithm, three IID choices will be made, and so you'll end up having $27$ possible end states each with equal probability. There are $3! = 6$ possible orderings, these should appear equally often, but this can't happen because $6$ does not divide $27$.
5.3-4
Professor Armstrong suggests the following procedure for generating a uniform random permutation:
cpp PERMUTE-BY-CYCLIC(A) n = A.length let B[1..n] be a new array offset = RANDOM(1, n) for i = 1 to n dest = i + offset if dest > n dest = dest - n B[dest] = A[i] return B
Show that each element $A[i]$ has a $1 / n$ probability of winding up in any particular position in $B$. Then show that Professor Armstrong is mistaken by showing that the resulting permutation is not uniformly random.
Fix a position $j$ and an index $i$. We'll show that the probability that $A[i]$ winds up in position $j$ is $1 / n$. The probability $B[j] = A[i]$ is the probability that $dest = j$, which is the probability that $i + offset$ or $i + offset − n$ is equal to $j$, which is $1 / n$.
This algorithm can't possibly return a random permutation because it doesn't change the relative positions of the elements; it merely cyclically permutes the whole permutation.
For instance, suppose $A = [1, 2, 3]$,
- if $offset = 1$, $B = [3, 1, 2]$,
- if $offset = 2$, $B = [2, 3, 1]$,
- if $v = 3$, $B = [1, 2, 3]$.
Thus, the algorithm will never produce $B = [1, 3, 2]$, so the resulting permutation cannot be uniformly random.
5.3-5 $\star$
Prove that in the array $P$ in procedure $\text{PERMUTE-BY-SORTING}$, the probability that all elements are unique is at least $1 - 1 / n$.
Let $\Pr\{j\}$ be the probability that the element with index $j$ is unique. If there are $n^3$ elements, then the $\Pr\{j\} = 1 - \frac{j - 1}{n^3}$.
$$ \begin{aligned} \Pr\{1 \cap 2 \cap 3 \cap \ldots\} & = \Pr\{1\} \cdot \Pr\{2 \mid 1\} \cdot \Pr\{3 \mid 1 \cap 2\} \cdots \\ & = 1 (1 - \frac{1}{n^3})(1 - \frac{2}{n^3})(1 - \frac{3}{n^3}) \cdots \\ & \ge 1 (1 - \frac{n}{n^3}) (1 - \frac{n}{n^3})(1 - \frac{n}{n^3}) \cdots \\ & \ge (1 - \frac{1}{n^2})^n \\ & \ge 1 - \frac{1}{n}, \\ \end{aligned} $$
where the last step holds for $(1 - x)^n \ge 1 - nx$.
5.3-6
Explain how to implement the algorithm $\text{PERMUTE-BY-SORTING}$ to handle the case in which two or more priorities are identical. That is, your algorithm should produce a uniform random permutation, even if two or more priorities are identical.
PERMUTE-BY-SORTING(A)
let P[1..n] be a new array
for i = 1 to n
P[i] = i
for i = 1 to n
swap P[i] with P[RANDOM(i, n)]
5.3-7
Suppose we want to create a random sample of the set $\{1, 2, 3, \ldots, n\}$, that is, an $m$-element subset $S$, where $0 \le m \le n$, such that each $m$-subset is equally likely to be created. One way would be to set $A[i] = i$ for $i = 1, 2, 3, \ldots, n$, call $\text{RANDOMIZE-IN-PLACE}(A)$, and then take just the first $m$ array elements. This method would make $n$ calls to the $\text{RANDOM}$ procedure. If $n$ is much larger than $m$, we can create a random sample with fewer calls to $\text{RANDOM}$. Show that the following recursive procedure returns a random $m$-subset $S$ of $\{1, 2, 3, \ldots, n\}$, in which each $m$-subset is equally likely, while making only $m$ calls to $\text{RANDOM}$:
cpp RANDOM-SAMPLE(m, n) if m == 0 return Ø else S = RANDOM-SAMPLE(m - 1, n - 1) i = RANDOM(1, n) if i ∈ S S = S ∪ {n} else S = S ∪ {i} return S
We prove that it produces a random $m$ subset by induction on $m$. It is obviously true if $m = 0$ as there is only one size $m$ subset of $[n]$. Suppose $S$ is a uniform $m − 1$ subset of $n − 1$, that is, $\forall j \in [n - 1]$, $\Pr[j \in S] = \frac{m - 1}{n - 1}$.
If we let $S'$ denote the returned set, suppose first $j \in [n − 1]$,
$$ \begin{aligned} \Pr[j \in S'] & = \Pr[j \in S] + \Pr[j \notin S \wedge i = j] \\ & = \frac{m - 1}{n - 1} + \Pr[j \notin S]\Pr[i = j] \\ & = \frac{m - 1}{n - 1} + \left(1 - \frac{m - 1}{n - 1}\right) \frac{1}{n} \\ & = \frac{n(m - 1) + n - m}{(n - 1)n} \\ & = \frac{nm - m}{(n - 1)n} = \frac{m}{n}. \end{aligned} $$
Since the constructed subset contains each of $[n − 1]$ with the correct probability, it must also contain $n$ with the correct probability because the probabilities sum to $1$.