Chapter 1
Exercise 1.1
We start by proving that is asymptotically equivalent to . For sufficiently large , the term dominates the constant hidden in . Therefore, implying . Notice that we can ignore the base of a logarithm in the notation.
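The last remark boils down to the change-of-base identity

\log_a N = \frac{\log_2 N}{\log_2 a},

so switching the base only rescales the logarithm by a constant factor.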
Exercise 1.2
The book already mentions that there are various strategies to avoid sentinels. For example, on the website of Algorithms, 4th Edition by RS and KW, you can find an interesting approach (see the section Creative Problems) that uses a single auxiliary array aux[]. As stated there: "The merge procedure copies the second half of a[] to aux[] in decreasing order and then does the merge back to a[]." This improves performance by eschewing boundary checks on indices. Nonetheless, such a mergesort is no longer stable, and stability is a key property of this sorting algorithm.
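Here is a minimal Python sketch of the quoted idea (the booksite's code is in Java; this is only an illustration, assuming plain lists of comparable keys):

def merge(a, aux, lo, mid, hi):
    """Merge a[lo..mid] with a[mid+1..hi] without sentinels.

    The first half is copied in ascending order and the second half in
    descending order, so the largest key of each half guards the other
    half's scan and no index bounds checks are needed. Equal keys may be
    emitted out of their original order once a scan runs into the other
    (reversed) half, which is why this merge is not stable.
    """
    for k in range(lo, mid + 1):
        aux[k] = a[k]
    for k in range(mid + 1, hi + 1):
        aux[k] = a[hi - (k - mid - 1)]   # second half copied in reverse
    i, j = lo, hi
    for k in range(lo, hi + 1):
        if aux[j] < aux[i]:
            a[k] = aux[j]
            j -= 1
        else:
            a[k] = aux[i]
            i += 1

A top-down mergesort can call this merge exactly as it would call the sentinel-based one.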
On the same website, you can find Merge.java, a standard implementation of the top-down mergesort algorithm that doesn’t rely on sentinels. As a bonus, it also offers index-based sorting, which avoids moving elements around. Moving keys could be expensive if they’re associated with satellite data.
Exercise 1.3
GeeksforGeeks covers 3-way mergesort in great detail.
Below is the driver code for measuring execution time. It should replace the original driver code in the GeeksforGeeks programs that implement the standard and 3-way mergesort algorithms.
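As a sketch of such a harness in Python (the sort_fn callable, the input size, and the key range are placeholders; adapt it to the implementation under test, e.g. sort_fn = lambda a: mergeSort(a, 0, len(a) - 1)):

import random
import time

def time_sort(sort_fn, n=5_000_000, trials=10, seed=42):
    """Time `sort_fn` on fresh random inputs and report trial statistics."""
    random.seed(seed)
    times = []
    for _ in range(trials):
        arr = [random.randint(0, 10**9) for _ in range(n)]
        start = time.perf_counter()
        sort_fn(arr)
        times.append(time.perf_counter() - start)
    print(f"Ran {trials} successful trials. "
          f"Average: {sum(times) / trials:.3f}s, "
          f"Min: {min(times):.3f}s, Max: {max(times):.3f}s")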
Here are the outputs created on a MacBook Pro 2019 machine:
--- standard mergesort ---
Ran 10 successful trials. Average: 2.260s, Min: 2.152s, Max: 2.426s
--- 3-way mergesort ---
Ran 10 successful trials. Average: 2.149s, Min: 2.031s, Max: 2.456s
3-way mergesort has a lower recursion depth, which may improve performance. It’s a better choice in external sorting scenarios, since it makes fewer passes over datasets typically stored on slow external media.
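To make the depth claim concrete, a change of base gives

\log_3 N = \frac{\lg N}{\lg 3} \approx 0.63 \lg N,

so the recursion tree of 3-way mergesort has roughly 37% fewer levels than that of the standard 2-way version.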
🌟 Exercise 1.4
This exercise introduces a neat trick for indirectly solving recurrences. Often, we can develop a recurrence describing the difference between two consecutive terms, and employ it to solve or prove some property of the original formula.
Using the recurrence (1) from the book, we have
Since and we can simplify the previous equation and get
The quantity is a positive integer, so we can recursively apply the above expansion by leveraging a useful property of floor and ceiling functions, namely that . We can carry out this process until the problem size drops to 1, recalling that . Thus, we get
Let’s express as a sum of differences, so we have
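In general, whatever the concrete recurrence, the telescoping identity behind this trick is

a_N = a_1 + \sum_{k=1}^{N-1} \left( a_{k+1} - a_k \right),

so the differences together with one initial value determine every term.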
🌟 Exercise 1.5
This exercise showcases an advanced usage of mathematical induction and techniques to manipulate floors and ceilings.
As mentioned in the book, the theorem for a general follows from (1) by induction. Let’s use the equation as the inductive hypothesis. The base case trivially holds, since . For we have
If is even, then , so the last line simplifies to
If is odd, then and , so we get
We must figure out how to express and . We know that for some positive integer . Observe that . If , then ; otherwise, . Therefore, we must differentiate these two sub-cases.
When we immediately get
If observe that . Thus, we have
All in all, we’ve proved that the inductive hypothesis is true for all .
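As a quick numerical sanity check of the induction, the snippet below compares the recurrence against a closed form. It assumes the standard mergesort recurrence C_1 = 0, C_N = C_⌈N/2⌉ + C_⌊N/2⌋ + N and the closed form N⌈lg N⌉ − 2^⌈lg N⌉ + N, so adjust it if your edition states the theorem differently.

from functools import lru_cache

@lru_cache(maxsize=None)
def c(n):
    # Assumed recurrence: C_1 = 0, C_N = C_ceil(N/2) + C_floor(N/2) + N.
    return 0 if n == 1 else c((n + 1) // 2) + c(n // 2) + n

def closed_form(n):
    k = (n - 1).bit_length()   # equals ceil(lg n) for n >= 1
    return n * k - 2**k + n

# The recurrence and the closed form agree for every n we try.
assert all(c(n) == closed_form(n) for n in range(1, 10_000))
print("closed form verified for 1 <= n < 10,000")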
Exercise 1.6
We use the following three-way merge procedure in Java instead of the original from Exercise 1.3. It uses sentinels, so counting the number of comparisons inside the loop is straightforward.
Each iteration of the main for loop performs 2 comparisons to determine the minimum among the 3 candidates. The recurrence is
If we consider the case when is a power of 3, and use the same analysis as in the book for the standard mergesort, then we get .
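Assuming the recurrence in question is C_N = 3C_{N/3} + 2N with C_1 = 0 (three subproblems of a third of the size, plus 2 compares per output position), unrolling it for N = 3^n gives

C_N = 2N \log_3 N = \frac{2}{\lg 3} N \lg N \approx 1.26\, N \lg N,

which is about 26% more compares than the roughly N \lg N used by standard mergesort, despite the shallower recursion.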
🌟 Exercise 1.7
Algorithm science is a methodology for developing machine-independent falsifiable hypotheses. This exercise is about the doubling hypothesis. According to Exercise 1.1 we know that . We can estimate , independent of the machine, as follows:
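The estimate used in the script below is T(2N) \approx 2T(N)\left(1 + \frac{1}{\lg N}\right). Assuming the model T(N) = cN\lg N + dN that the script simulates, this follows because

T(2N) = 2cN\lg(2N) + 2dN = 2T(N) + 2cN, \qquad 2T(N)\left(1 + \tfrac{1}{\lg N}\right) = 2T(N) + 2cN + \tfrac{2dN}{\lg N},

and the leftover term 2dN/\lg N is of lower order, which is why the relative error shrinks as N grows.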
The next Python script shows how well our estimated reflects the actual . We can run different experiments by altering the machine-dependent constants and .
from math import log

# Machine-dependent constants in the simulation.
c = 3.5
d = 7.1

def t(n):
    return c * n * log(n, 2) + d * n

def estimate_t_2n(t_n, n):
    return 2 * t_n * (1 + 1 / log(n, 2))

for n in [10, 100, 1000, 10000]:
    t_n = t(n)
    est_t_2n = estimate_t_2n(t_n, n)
    act_t_2n = t(2 * n)
    print(f"n: {n:6},\tEstimated: {est_t_2n:10.2f},\tActual: {act_t_2n:10.2f},\tRel. Error: {abs(est_t_2n - act_t_2n) / act_t_2n:.2%}")

The output shows that the relative error drops quickly as n increases.
n: 10, Estimated: 487.28, Actual: 444.53, Rel. Error: 9.62%
n: 100, Estimated: 6984.43, Actual: 6770.70, Rel. Error: 3.16%
n: 1000, Estimated: 92385.37, Actual: 90960.49, Rel. Error: 1.57%
n: 10000, Estimated: 1152826.43, Actual: 1142139.87, Rel. Error: 0.94%
Exercise 1.8
Let’s use the Python implementation of mergesort from GeeksforGeeks and replace the driver code with the following snippet:
# Driver code
import time
from math import log
from random import randint

if __name__ == "__main__":
    for n in (1_000_000, 10_000_000):
        arr = [randint(0, 100) for _ in range(n)]
        start_time = time.time()
        mergeSort(arr, 0, len(arr) - 1)
        end_time = time.time()
        t_n = end_time - start_time
        if n == 1_000_000:
            print(f"Actual 1x time: {t_n:.2f} seconds")
            t10x_estimate = 10 * t_n * (1 + log(10, 2) / log(n, 2))
            print(f"Estimated 10x time: {t10x_estimate:.2f} seconds")
        else:
            print(f"Actual 10x time: {t_n:.2f} seconds")
            print(f"Relative Error: {abs(t_n - t10x_estimate) / t_n:.2%}")

Here’s an output on MacBook Pro 2019:
Actual 1x time: 4.05 seconds
Estimated 10x time: 47.20 seconds
Actual 10x time: 45.84 seconds
Relative Error: 2.97%
Evidently, the estimated 10x time closely matches the actual 10x running time.
Exercise 1.9
The program Sort2distinct.java implements this algorithm in Java. Actually, it runs the 3-way partitioning procedure.
The inline comment for the sort method says that the code assumes the input has at most 3 distinct values. Without any preprocessing, this should be 2 (see Exercise 1.10).
Exercise 1.10
Perform an initial linear scan to record one set of indices , and for those three distinct values. Determine the index of the median and exchange it with the element at the beginning of the array. Finally, run the 3-way partitioning procedure from Exercise 1.9.
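As an illustration only (the booksite's Sort2distinct.java is the reference), here's a Python sketch of this plan using a Dijkstra-style 3-way partition:

def sort_three_distinct(a):
    """Sort a list containing at most three distinct keys in linear time."""
    # Linear scan: remember one index per distinct value (at most three).
    first_index = {}
    for i, v in enumerate(a):
        if v not in first_index:
            first_index[v] = i
    if len(first_index) <= 1:
        return                                   # nothing to do
    # Move one occurrence of the median distinct value to the front so it
    # can serve as the partitioning value. (With only two distinct values,
    # partitioning on the larger one already sorts the array.)
    distinct = sorted(first_index)
    median = distinct[len(distinct) // 2]
    a[0], a[first_index[median]] = a[first_index[median]], a[0]
    # 3-way partition around v = a[0]: smaller keys, keys equal to v, and
    # larger keys end up in three contiguous blocks, i.e. the array is sorted.
    v = a[0]
    lt, i, gt = 0, 1, len(a) - 1
    while i <= gt:
        if a[i] < v:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > v:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1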
As an additional practice, solve the LeetCode problem Sort Colors.
🌟 Exercise 1.11
The number of compares is a reliable estimator of the runtime performance of comparison-based sorting algorithms. The Python script below tests the hypothesis that the running time of mergesort divided by the number of compares that it uses approaches a constant as the problem size increases. The merge procedure uses sentinels to be aligned with the version from the book.
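A sketch of such a script (an illustration, not necessarily the exact code behind the table below): the merge appends float('inf') sentinels and counts one compare per output position, and the timing uses time.perf_counter.

import random
import time

compares = 0

def merge(a, lo, mid, hi):
    """Merge a[lo..mid] with a[mid+1..hi] using infinity sentinels."""
    global compares
    left = a[lo:mid + 1] + [float("inf")]
    right = a[mid + 1:hi + 1] + [float("inf")]
    i = j = 0
    for k in range(lo, hi + 1):
        compares += 1                     # one compare per output position
        if right[j] < left[i]:
            a[k] = right[j]
            j += 1
        else:
            a[k] = left[i]
            i += 1

def mergesort(a, lo, hi):
    if hi <= lo:
        return
    mid = (lo + hi) // 2
    mergesort(a, lo, mid)
    mergesort(a, mid + 1, hi)
    merge(a, lo, mid, hi)

print(f"{'n':>10} {'t(n)':>14} {'c(n)':>14} {'t(n)/c(n)':>10}")
n = 2
while n <= 1 << 20:
    compares = 0
    arr = [random.random() for _ in range(n)]
    start = time.perf_counter()
    mergesort(arr, 0, n - 1)
    t_us = (time.perf_counter() - start) * 1e6   # running time in microseconds
    print(f"{n:>10} {t_us:>14.2f} {compares:>14.2f} {t_us / compares:>10.2f}")
    n *= 2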
The running time is shown in microseconds. The ratio approaches a constant as the problem size increases; the problem size is always a power of 2, so the values of c(n) are exact. Interestingly, a similar ratio occurred on both a MacBook Pro 2019 and an HP laptop running Windows 11.
n t(n) c(n) t(n)/c(n)
--------------------------------------------
2 7.87 2.00 3.93
4 10.97 8.00 1.37
8 14.07 24.00 0.59
16 29.33 64.00 0.46
32 60.08 160.00 0.38
64 126.84 384.00 0.33
128 268.22 896.00 0.30
256 565.05 2048.00 0.28
512 1359.70 4608.00 0.30
1024 2971.89 10240.00 0.29
2048 6124.02 22528.00 0.27
4096 11742.83 49152.00 0.24
8192 25410.18 106496.00 0.24
16384 53179.98 229376.00 0.23
32768 108177.90 491520.00 0.22
65536 229867.22 1048576.00 0.22
131072 469205.86 2228224.00 0.21
262144 982743.98 4718592.00 0.21
524288 2103214.74 9961472.00 0.21
1048576 4166778.09 20971520.00 0.20
Exercise 1.12
Let denote the cumulative (total) number of compares used by quicksort on all permutations of elements. The average number of comparisons is , so the base cases are and . We can derive based upon the simplified recurrence of from the book for .
🌟 Exercise 1.13
The invariant that the subarrays left after partitioning a random permutation are themselves both random permutations is quintessential for the analysis and performance. The proof below is adapted from Sedgewick’s PhD thesis.
Observe that Program 1.2 only moves the partitioning element (pivot) to its final position at the end. Leaving the pivot out of the partitioning process is crucial to ensure that the invariant remains true during the whole process.
Suppose that a permutation of is being partitioned with all such permutations equally likely. Choose a pivot , so that after partitioning the size of the left subarray becomes . Any rearrangement of these elements, out of the possible , in the original permutation is mapped to a unique permutation in the left subarray. In other words, we can establish a bijection between these two objects. Since all the original permutations are equally likely, the same is true for permutations in the left subarray. The analogous reasoning applies, independently, for the right subarray. This concludes the proof.
Moving the pivot during the partitioning process, by setting , introduces bias into the subarrays (unless the pivot already happens to be in its final position). The probability of having a permutation where the pivot remains in its initial position is quite low. In our case, the first element from the left larger than the pivot will trigger it to move; it tends to fall near the beginning of the left subarray. Therefore, not all positions are equally likely to contain .
Program 1.2 even becomes incorrect if j is set to hi+1; the line that exchanges a[i] with a[hi] no longer makes sense, since the pivot has most probably already been moved somewhere else.
🌟 Exercise 1.14
We follow the technique from the book to eliminate summation on the RHS. We have
🌟 Exercise 1.15
The proof is adapted from Sedgewick’s PhD thesis.
WLOG assume that the input array is a random permutation of the set to make the exposition leaner. Finding the average number of exchanges used during the first partitioning stage (before the pointers cross) is a harder task than deriving or . We have dependencies on both the pivot and arrangement of keys (values to sort). The base cases are and the recurrence is
Let’s analyze the first term of the sum by looking at the Program 1.2 from the book. Quicksort is parsimonious regarding exchanges; only those elements are swapped that must be swapped. If the pivot is , then only those keys among which are smaller than are exchanged into the right subarray. Furthermore, each element is exchanged at most once during the whole partitioning process. Let be the random variable denoting the number of exchanges when the pivot is . The probability mass function of is given by
The numerator expresses in how many ways we can select smaller elements from the left subarray and use them to form the right subarray. The denominator is just the total number of possible combinations to form the right subarray. As a side note, the right side of the equation follows from applying the identity and shows symmetry in the problem, i.e., it describes the case when is chosen as the pivot. At any rate, .
The average number of exchanges is the expectation of defined as
The recurrence for the average number of exchanges is
If we take into account the official errata, the result changes slightly. Namely, the equation holds only if we assume that the average partitioning cost in is instead of .
Exercise 1.16
Let denote the number of subarrays of size encountered, on the average, when sorting a random array of size with quicksort. We have 3 cases:
If then we can’t have a subarray of size , so .
If then , since only the initial array is of size .
If then the initial call is on a subarray whose size is larger than , hence we don’t count it. We average the counts from the resulting subproblems. The recurrence is
Fast forwarding the exact same steps as in Exercise 1.14, we get
Observe that we can’t iterate further than . Nonetheless, we can calculate it using the main recurrence to get ; all base cases are zeros except . Therefore, we have
🌟 Exercise 1.17
Fast forwarding the exact same steps as in the proof of Theorem 1.3, we get
We can calculate using the main recurrence
The formula for the average number of compares in quicksort, when using insertion sort to handle small subarrays, is
If we take into account the official errata, then the average partitioning cost in is instead of . Thus, the formula changes to
Exercise 1.18
After ignoring small terms (those significantly less than ) as well as constant multiples of (they don’t influence the minimization task), we can arrive at the following approximation
We can use a suitable math tool, like WolframAlpha, to solve the minimization problem as well as plot the graph of . Issue plot m(m-1)/(6(m+2))-2ln(m) for 1<=m<60 to see its shape.

You can find the minimum by saying m(m-1)/(6(m+2))-2ln(m) minimize for m>1. It’ll report a global minimum at .
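If you'd rather stay in Python, a simple grid search over the same function locates the minimizer without an external solver (the search range and step size below are arbitrary choices):

from math import log

def f(m):
    # The approximation from above: m(m-1)/(6(m+2)) - 2 ln(m).
    return m * (m - 1) / (6 * (m + 2)) - 2 * log(m)

# Evaluate f on a fine grid over 1 < m < 101 and pick the minimizing m.
best_m = min((1 + k / 1000 for k in range(1, 100_000)), key=f)
print(f"minimum near m = {best_m:.3f}, f(m) = {f(best_m):.4f}")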
Exercise 1.19
According to the graph from Exercise 1.18 the threshold is .
Exercise 1.20
Below is the Python script that reports how many keys in a file of random positive integers less than are likely to be equal to some other keys in the file.
import random
import math
from collections import Counter
from statistics import mean, pstdev

def count_colliding_keys(n, m):
    # Generate n random positive integers in [1, m-1].
    arr = [random.randrange(1, m) for _ in range(n)]
    freq = Counter(arr)
    # Count all occurrences of values that appear at least twice.
    return sum(cnt for cnt in freq.values() if cnt >= 2)

def simulate(n=10_000, m=1_000_000, trials=1000, seed=None):
    if seed is not None:
        random.seed(seed)
    results = [count_colliding_keys(n, m) for _ in range(trials)]
    mu = mean(results)
    sigma = pstdev(results)
    ci_half = 1.96 * sigma / math.sqrt(trials)
    analytic = n * (1 - (1 - 1/m) ** (n - 1))
    print(f"n={n}, m={m}, trials={trials}")
    print(f"simulated mean = {mu:.3f}, std = {sigma:.3f}")
    print(f"95% CI for mean = [{mu-ci_half:.3f}, {mu+ci_half:.3f}]")
    print(f"analytic estimate = {analytic:.3f}")

simulate(seed=1)

The output also shows the 95% confidence interval for the mean:
n=10000, m=1000000, trials=1000
simulated mean = 99.542, std = 13.895
95% CI for mean = [98.681, 100.403]
analytic estimate = 99.492
The analytic formula is the same as for the number of people with a shared birthday; replace the number of days with .
The program sets the seed of the random number generator, which makes the results reproducible even though the numbers are randomly generated.
Exercise 1.21
Try changing the relative magnitudes of and in Exercise 1.20 to see how the number of duplicates increases as approaches , especially when . The random permutation model assumes that all elements are distinct and that all possible permutations are equally likely to occur. Nevertheless, even in the presence of many duplicates, Program 1.2 retains good performance. As emphasized in Sedgewick’s paper Quicksort with Equal Keys, the evidence in favor of stopping the scanning pointers on keys equal to the partitioning element in quicksort is overwhelming. This is exactly how partitioning works in Program 1.2.
There are also other schemes for handling duplicates efficiently. The website of Algorithms, 4th Edition by RS and KW has an extensive treatment of quicksort, including variants like fast 3-way and dual-pivot. There’s also background information about entropy-optimal sorting, which states the following:
Arrays with large numbers of duplicate sort keys arise frequently in applications. In such applications, there’s potential to reduce the time of the sort from linearithmic to linear.
Thus, this is one example where the random permutation model is inaccurate. Of course, the story can go the other way around when partitioning doesn’t account for duplicates. In this case, quicksort may show quadratic runtime.
Exercise 1.22
Essentially, the distribution of keys is "asymptotically" immaterial for mergesort, as it achieves runtime in all cases. The sorting numbers specify the worst-case number of comparisons made by mergesort (see also Exercise 1.5). The best case requires about half as many compares as the worst case. Of course, this last statement doesn’t apply to Program 1.1 (see Exercise 1.23). Thus, something like Table 1.1 isn’t useful for gaining insight into stochastic fluctuations due to dependence on probabilistic models.
On the other hand, Table 1.1 could validate a model, implementation and/or analysis. It could help in revealing machine-dependent constants in a model (see Exercise 1.11) or pinpoint awkward implementation-related issues, perhaps bugs, if the empirical evidence goes against mathematical predictions. Finally, it may assure us that the model for a new variant of mergesort aligns well with empirical data.
Exercise 1.23
The running time of mergesort as implemented in Program 1.1 depends only on the number of elements in the array being sorted, not on their arrangement. Therefore, the standard deviation in the number of compares is zero, since the input size fully determines the cost.