2-1 Insertion sort on small arrays in merge sort
Although merge sort runs in $\Theta(n\lg n)$ worst-case time and insertion sort runs in $\Theta(n^2)$ worst-case time, the constant factors in insertion sort can make it faster in practice for small problem sizes on many machines. Thus, it makes sense to coarsen the leaves of the recursion by using insertion sort within merge sort when subproblems become sufficiently small. Consider a modification to merge sort in which $n / k$ sublists of length $k$ are sorted using insertion sort and then merged using the standard merging mechanism, where $k$ is a value to be determined.
a. Show that insertion sort can sort the $n / k$ sublists, each of length $k$, in $\Theta(nk)$ worst-case time.
b. Show how to merge the sublists in $\Theta(n\lg(n / k))$ worst-case time.
c. Given that the modified algorithm runs in $\Theta(nk + n\lg(n / k))$ worst-case time, what is the largest value of $k$ as a function of $n$ for which the modified algorithm has the same running time as standard merge sort, in terms of $\Theta$-notation?
d. How should we choose $k$ in practice?
a. The worst-case time to sort a list of length $k$ by insertion sort is $\Theta(k^2)$. Therefore, sorting $n / k$ sublists, each of length $k$, takes $\Theta(k^2 \cdot n / k) = \Theta(nk)$ worst-case time.
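The count in part (a) can be checked on a small instance. The following is a minimal sketch, not part of the original solution: the helper names `insertion_sort_shifts` and `sort_chunks` are hypothetical, and the sizes $n = 32$, $k = 4$ are chosen only for illustration. It builds the worst case (every chunk in decreasing order) and compares the number of element shifts with $(n / k) \cdot k(k - 1) / 2$, which is $\Theta(nk)$.

```python
def insertion_sort_shifts(a, lo, hi):
    """Sort a[lo:hi] in place with insertion sort; return the number of shifts."""
    shifts = 0
    for i in range(lo + 1, hi):
        key = a[i]
        j = i - 1
        # Shift larger elements one slot to the right.
        while j >= lo and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
            shifts += 1
        a[j + 1] = key
    return shifts


def sort_chunks(a, k):
    """Insertion-sort each consecutive length-k chunk of a; return total shifts."""
    total = 0
    for lo in range(0, len(a), k):
        total += insertion_sort_shifts(a, lo, min(lo + k, len(a)))
    return total


n, k = 32, 4  # illustrative sizes (assumption)
# Worst case for insertion sort: each chunk is in strictly decreasing order.
data = [k - 1 - (i % k) for i in range(n)]
shifts = sort_chunks(data, k)
expected = (n // k) * k * (k - 1) // 2
print(shifts, expected)  # both are 48 for n = 32, k = 4
```

Each reversed chunk costs $0 + 1 + \dots + (k - 1) = k(k - 1) / 2$ shifts, and there are $n / k$ chunks, which gives the $\Theta(nk)$ total.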
b. We have $n / k$ sorted sublists, each of length $k$. To merge them into a single sorted list of length $n$, we merge the sublists two at a time, then merge the resulting lists two at a time, and so on. Each round halves the number of lists, so there are $\lceil \lg(n / k) \rceil$ rounds, and each round processes all $n$ elements in $\Theta(n)$ time. Therefore, the worst-case time to merge the sublists is $\Theta(n\lg(n / k))$.
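The pairwise merging in part (b) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names `merge` and `merge_runs` and the sample runs are hypothetical, and the lists are merged into new lists rather than in place. It counts the rounds of merging, which for $n / k = 8$ runs is $\lg 8 = 3$.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list in linear time."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    # At most one of these two tails is non-empty.
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def merge_runs(runs):
    """Merge sorted runs two at a time, round by round.

    Returns (sorted list, number of rounds).
    """
    rounds = 0
    while len(runs) > 1:
        merged = [merge(runs[i], runs[i + 1]) for i in range(0, len(runs) - 1, 2)]
        if len(runs) % 2 == 1:
            merged.append(runs[-1])  # an unpaired run carries over to the next round
        runs = merged
        rounds += 1
    return (runs[0] if runs else []), rounds


# Eight sorted runs of length k = 2, so n = 16 (sample data, assumption).
runs = [[1, 9], [3, 4], [0, 7], [2, 8], [5, 6], [10, 12], [11, 13], [14, 15]]
result, rounds = merge_runs(runs)
print(result, rounds)
```

Every round handles each of the $n$ elements once, so the total work is the number of rounds times $\Theta(n)$.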
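c. The modified algorithm has the same running time as standard merge sort when $\Theta(nk + n\lg(n / k)) = \Theta(n\lg n)$. If $k$ grows asymptotically faster than $\lg n$, that is, $k = \omega(\lg n)$, then the $nk$ term alone is $\omega(n\lg n)$, so $k$ can be at most $\Theta(\lg n)$. Conversely, for $k = \Theta(\lg n)$,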
$$ \begin{aligned} \Theta(nk + n\lg(n / k)) & = \Theta(nk + n\lg n - n\lg k) \\ & = \Theta(n\lg n + n\lg n - n\lg(\lg n)) \\ & = \Theta(2n\lg n - n\lg(\lg n)) \\ & = \Theta(n\lg n). \end{aligned} $$
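d. Choose $k$ to be the largest sublist length for which insertion sort is faster than merge sort. Because this crossover depends on the constant factors of the particular machine and implementation, it is found in practice by measurement.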
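One way to find such a $k$ empirically is to time the modified algorithm for several cutoffs. The sketch below is an assumption-laden illustration rather than a prescribed method: `hybrid_merge_sort` and `best_cutoff` are hypothetical names, the candidate cutoffs and input size are arbitrary, and the timings it reports will differ between machines and runs.

```python
import random
import timeit


def hybrid_merge_sort(a, k):
    """Return a sorted copy of a; subarrays of length <= k use insertion sort."""
    if len(a) <= max(k, 1):
        a = list(a)
        for i in range(1, len(a)):
            key = a[i]
            j = i - 1
            while j >= 0 and a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key
        return a
    mid = len(a) // 2
    left = hybrid_merge_sort(a[:mid], k)
    right = hybrid_merge_sort(a[mid:], k)
    # Standard linear-time merge of the two sorted halves.
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out


def best_cutoff(candidates, n=2000, repeats=3):
    """Time hybrid_merge_sort on one random input for each candidate cutoff.

    Returns (fastest cutoff, {cutoff: best time in seconds}).
    """
    rng = random.Random(0)  # fixed seed so every cutoff sees the same input
    data = [rng.random() for _ in range(n)]
    timings = {}
    for k in candidates:
        runs = timeit.repeat(lambda: hybrid_merge_sort(data, k), number=1, repeat=repeats)
        timings[k] = min(runs)
    return min(timings, key=timings.get), timings


k_best, timings = best_cutoff([1, 4, 8, 16, 32, 64])
print(k_best, timings)
```

Taking the minimum over repeated runs reduces noise from other processes; a more careful measurement would also vary the input size and input order.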