Chapter 1

Exercise 1.1

We start by proving that $f(N)$ is asymptotically equivalent to $N\lg N$. For sufficiently large $N$, the $\lg N$ term dominates the constant hidden in $O(N)$. Therefore, $\lim_{N \to \infty} |f(N)/(N\lg N)|=|1+O(N)/(N\lg N)|=1$, implying $f(N)=\Theta(N\log N)$.
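To see the convergence concretely, here is a small numeric sketch; the hidden constant 5 in the $O(N)$ term is a made-up example, not anything from the exercise:

```python
import math

# Hypothetical f(N) = N*lg(N) + 5N, where 5 stands in for the constant
# hidden in the O(N) term; the ratio f(N)/(N lg N) tends to 1.
def f(n, c=5):
    return n * math.log2(n) + c * n

ratios = [f(10 ** e) / (10 ** e * math.log2(10 ** e)) for e in range(2, 8)]
print(ratios)  # decreases monotonically toward 1
```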

Exercise 1.2

The book already mentions that there are various strategies to avoid sentinels. For example, on the Algorithms, 4th Edition by RS, KW book's website, you can find an interesting approach (see the section Creative Problems) that uses a single auxiliary array aux[]. As stated there: "The merge procedure copies the second half of a[] to aux[] in decreasing order and then does the merge back to a[]." This improves performance by eschewing boundary checks on indices. Nonetheless, such a mergesort isn't stable anymore, and stability is a key property of this sorting algorithm.
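For illustration, here is a Python sketch of that idea (the site's version is in Java): the second half is copied in decreasing order, so the two halves face each other and the largest element of each acts as the other's sentinel. Equal keys from the second half come back in reversed relative order, which is why stability is lost.

```python
def merge(a, aux, lo, mid, hi):
    # Copy the first half as is and the second half in decreasing order,
    # so each half acts as a sentinel for the other.
    for k in range(lo, mid + 1):
        aux[k] = a[k]
    for k in range(mid + 1, hi + 1):
        aux[k] = a[hi - k + mid + 1]
    i, j = lo, hi
    for k in range(lo, hi + 1):
        # No bounds checks needed: each scan is blocked by the other
        # half's largest element.
        if aux[j] < aux[i]:
            a[k] = aux[j]
            j -= 1
        else:
            a[k] = aux[i]
            i += 1

def merge_sort(a, aux=None, lo=0, hi=None):
    if aux is None:
        aux, hi = list(a), len(a) - 1
    if hi <= lo:
        return a
    mid = (lo + hi) // 2
    merge_sort(a, aux, lo, mid)
    merge_sort(a, aux, mid + 1, hi)
    merge(a, aux, lo, mid, hi)
    return a
```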

On the same website, you have Merge.java, a standard implementation of the top-down mergesort algorithm that doesn't rely on sentinels. As a bonus, it also offers index-based sorting, which avoids moving elements around. Shuffling keys could be expensive if they're associated with satellite data.

Exercise 1.3

GeeksforGeeks covers 3-way mergesort in great detail.

Below is the driver code to measure execution time. It should replace the original driver code in the GeeksforGeeks programs implementing the standard and 3-way mergesort algorithms.
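The original driver isn't reproduced here; a Python sketch of such a driver might look like the following (the function name `time_sort` and the trial count are arbitrary choices of this sketch):

```python
import random
import time

def time_sort(sort_fn, n, trials=5):
    """Average wall-clock seconds for sort_fn on n random ints."""
    total = 0.0
    for _ in range(trials):
        data = [random.randint(0, n) for _ in range(n)]
        start = time.perf_counter()
        sort_fn(data)
        total += time.perf_counter() - start
    return total / trials

# Example: time Python's built-in sort; swap in the GfG mergesort
# functions (their names depend on the chosen implementation).
print(time_sort(sorted, 10_000))
```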

Here are the outputs created on a MacBook Pro 2019 machine:

3-way mergesort has a lower recursion depth, which may improve performance. It's a better choice in external sorting scenarios, since it makes fewer passes over datasets typically stored on slow external media.

Exercise 1.4

Using the recurrence (1) from the book, we have

$$C_{N+1}-C_N=C_{\lceil (N+1)/2 \rceil}+C_{\lfloor (N+1)/2 \rfloor}+(N+1)-C_{\lceil N/2 \rceil}-C_{\lfloor N/2 \rfloor}-N.$$

Since $\lfloor (N+1)/2 \rfloor=\lceil N/2 \rceil$ and $\lceil (N+1)/2 \rceil=\lfloor N/2 \rfloor +1$, we can simplify the previous equation and get

$$C_{N+1}-C_N=C_{\lfloor N/2 \rfloor +1}-C_{\lfloor N/2 \rfloor}+1.$$

The quantity $\lfloor N/2 \rfloor$ is a positive integer, so we can apply the above expansion recursively, leveraging a useful property of floor and ceiling functions: $\lfloor\lfloor x\rfloor/2 \rfloor=\lfloor x/2 \rfloor$. We can carry out this process until the problem size drops to 1, recalling that $C_1=0$. Thus, we get

$$
\begin{align*}
C_{N+1}-C_N &= C_{\lfloor N/2 \rfloor +1}-C_{\lfloor N/2 \rfloor}+1 \\
&= C_{\lfloor N/4 \rfloor +1}-C_{\lfloor N/4 \rfloor}+1+1 \\
&\;\;\vdots \\
&= C_2-C_1+\underbrace{1+1+\dots+1}_{\lfloor \lg N \rfloor \text{ times}} \\
&= 2+\lfloor \lg N \rfloor.
\end{align*}
$$

Let's express $C_N$ as a sum of differences, so we have

$$
\begin{align*}
C_N &= (C_N-C_{N-1})+(C_{N-1}-C_{N-2})+\dots+(C_2-C_1) \\
&= (2+\lfloor\lg (N-1)\rfloor)+(2+\lfloor\lg (N-2)\rfloor)+\dots+(2+\lfloor\lg 1\rfloor) \\
&= \sum_{1 \le k <N} (\lfloor\lg k\rfloor+2).
\end{align*}
$$
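A quick sanity check of this derivation, comparing the sum against the original divide-and-conquer recurrence:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def C(n):
    # Mergesort compare recurrence: C_N = C_ceil(N/2) + C_floor(N/2) + N.
    return 0 if n <= 1 else C((n + 1) // 2) + C(n // 2) + n

def C_sum(n):
    # floor(lg k) == k.bit_length() - 1 for k >= 1 (exact, no float error).
    return sum((k.bit_length() - 1) + 2 for k in range(1, n))

print(all(C(n) == C_sum(n) for n in range(1, 2000)))  # True
```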

Exercise 1.5 🌟


As mentioned in the book, the theorem for a general $N$ follows from (1) by induction. Let's use the equation $C_N=N \lceil \lg N \rceil+N-2^{\lceil \lg N \rceil}$ as the inductive hypothesis. The base case $C_1=0$ trivially holds, since $\lg 1=0$. For $N>1$ we have

$$
\begin{align*}
C_N&=\lceil N/2 \rceil \lceil \lg {\lceil N/2 \rceil} \rceil+\lceil N/2 \rceil-2^{\lceil \lg {\lceil N/2 \rceil} \rceil} \\
&\hspace{1cm} + \lfloor N/2 \rfloor \lceil \lg {\lfloor N/2 \rfloor} \rceil+\lfloor N/2 \rfloor-2^{\lceil \lg {\lfloor N/2 \rfloor} \rceil} && \text{(by the inductive hypothesis)} \\
&\hspace{1cm} +N \\
&= \lceil N/2 \rceil \lceil \lg {\lceil N/2 \rceil} \rceil+\lfloor N/2 \rfloor \lceil \lg {\lfloor N/2 \rfloor} \rceil+2N-2^{\lceil \lg {\lceil N/2 \rceil} \rceil}-2^{\lceil \lg {\lfloor N/2 \rfloor} \rceil} && \text{($N=\lceil N/2 \rceil+\lfloor N/2 \rfloor$)}.
\end{align*}
$$

If $N$ is even, then $\lceil N/2 \rceil=\lfloor N/2 \rfloor=N/2$, so the last line simplifies to

$$
\begin{align*}
C_N&=2 \cdot (N/2) \lceil \lg {(N/2)} \rceil+2N- 2\cdot 2^{\lceil \lg {(N/2)} \rceil} \\
&=N \lceil \lg N \rceil-N+2N-2^{\lceil \lg N \rceil} && \text{($\lceil \lg {(N/2)} \rceil=\lceil \lg N-1 \rceil=\lceil \lg N \rceil-1$)} \\
&=N \lceil \lg N \rceil+N-2^{\lceil \lg N \rceil}. && \checkmark
\end{align*}
$$

If $N$ is odd, then $\lceil N/2 \rceil=(N+1)/2$ and $\lfloor N/2 \rfloor=(N-1)/2$, so we get

$$
\begin{align*}
C_N&=\frac{N+1}{2} \bigg\lceil \lg \frac{N+1}{2} \bigg\rceil+\frac{N-1}{2} \bigg\lceil \lg \frac{N-1}{2} \bigg\rceil+2N-2^{\big\lceil \lg \frac{N+1}{2} \big\rceil}-2^{\big\lceil \lg \frac{N-1}{2} \big\rceil} \\[0.4cm]
&=\frac{N+1}{2} \lceil \lg (N+1) \rceil + \frac{N-1}{2} \lceil \lg (N-1) \rceil-N+2N-2^{\lceil \lg (N+1) \rceil-1}-2^{\lceil \lg (N-1) \rceil-1} \\[0.4cm]
&=\frac{N+1}{2} \lceil \lg (N+1) \rceil + \frac{N-1}{2} \lceil \lg (N-1) \rceil+N-2^{\lceil \lg (N +1)\rceil-1}-2^{\lceil \lg (N-1) \rceil-1}.
\end{align*}
$$

We must figure out how to express $\lceil \lg (N+1) \rceil$ and $\lceil \lg (N-1) \rceil$. We know that $N \in (2^k,2^{k+1})$ for some positive integer $k$. Observe that $\lceil \lg (N+1) \rceil=\lceil \lg N\rceil=k+1$. If $N>2^k+1$, then $\lceil \lg (N-1) \rceil=\lceil \lg N \rceil=k+1$; otherwise $\lceil \lg (N-1) \rceil=\lceil \lg N \rceil-1=k$. Therefore, we must differentiate these two sub-cases.

When $N>2^k+1$, we immediately get

$$
\begin{align*}
C_N &=\frac{N+1}{2} \lceil \lg N \rceil + \frac{N-1}{2} \lceil \lg N \rceil+N-2^{\lceil \lg N \rceil-1}-2^{\lceil \lg N \rceil-1} \\
&=N \lceil \lg N \rceil +N-2 \cdot 2^{\lceil \lg N \rceil-1} \\
&= N \lceil \lg N \rceil +N-2^{\lceil \lg N \rceil}. \quad \checkmark
\end{align*}
$$

If $N=2^k+1$, observe that $(N-1)/2=(2^k+1-1)/2=2^{k-1}=2^{\lceil \lg N \rceil-2}$. Thus, we have

$$
\begin{align*}
C_N &=\frac{N+1}{2} \lceil \lg N \rceil + \frac{N-1}{2} \lceil \lg N \rceil-2^{\lceil \lg N \rceil-2}+N-2^{\lceil \lg N \rceil-1}-2^{\lceil \lg N \rceil-2} \\
&=N \lceil \lg N \rceil +N-2^{\lceil \lg N \rceil-1}-2 \cdot 2^{\lceil \lg N \rceil-2} \\
&=N \lceil \lg N \rceil +N-2 \cdot 2^{\lceil \lg N \rceil-1} \\
&=N \lceil \lg N \rceil +N-2^{\lceil \lg N \rceil}. \quad \checkmark
\end{align*}
$$

All in all, we've proved that the inductive hypothesis is true for all $N \ge 1$.
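The induction can also be spot-checked numerically against recurrence (1), with $\lceil\lg N\rceil$ computed exactly via bit_length:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def C(n):
    # Recurrence (1): C_N = C_ceil(N/2) + C_floor(N/2) + N, C_1 = 0.
    return 0 if n <= 1 else C((n + 1) // 2) + C(n // 2) + n

def closed_form(n):
    ceil_lg = (n - 1).bit_length()  # ceil(lg n) for n >= 1
    return n * ceil_lg + n - 2 ** ceil_lg

print(all(C(n) == closed_form(n) for n in range(1, 3000)))  # True
```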

Exercise 1.6

Instead of the original from Exercise 1.3, we use a three-way merge procedure that relies on sentinels, which makes counting the number of comparisons inside the loop straightforward.
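A Python sketch of such a sentinel-based scheme follows (the function name and the half-open bounds convention are choices of this sketch, not the book's); it charges 2 comparisons per output element:

```python
def merge3(a, lo, mid1, mid2, hi):
    # Merge the three sorted runs a[lo:mid1], a[mid1:mid2], a[mid2:hi].
    # An infinity sentinel is appended to each run, so the main loop
    # never has to test whether a run is exhausted.
    INF = float('inf')
    left = a[lo:mid1] + [INF]
    middle = a[mid1:mid2] + [INF]
    right = a[mid2:hi] + [INF]
    i = j = k = 0
    comparisons = 0
    for pos in range(lo, hi):
        comparisons += 2  # 2 compares pick the minimum of 3 candidates
        if left[i] <= middle[j]:
            if left[i] <= right[k]:
                a[pos] = left[i]; i += 1
            else:
                a[pos] = right[k]; k += 1
        else:
            if middle[j] <= right[k]:
                a[pos] = middle[j]; j += 1
            else:
                a[pos] = right[k]; k += 1
    return comparisons
```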

Each iteration of the main for loop performs 2 comparisons to determine the minimum among the 3 candidates. The recurrence is

$$C_N=3C_{N/3}+2N \quad \text{for $N \ge 2$, with $C_1=C_0=0$.}$$

If we consider the case when $N$ is a power of 3, and use the same analysis as in the book for the standard mergesort, then we get $C_N=2N\log_3 N$.
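A two-line check that this solution matches the recurrence for powers of 3:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def C(n):
    # C_N = 3*C_{N/3} + 2N with C_1 = C_0 = 0 (N a power of 3 here).
    return 0 if n <= 1 else 3 * C(n // 3) + 2 * n

# For N = 3^k: C_N == 2*N*log3(N) == 2*N*k.
print(all(C(3 ** k) == 2 * 3 ** k * k for k in range(12)))  # True
```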

Exercise 1.7 🌟


According to Exercise 1.1, we know that $t_N=cN \lg N+dN \sim cN \lg N$. We can estimate $t_{2N}$, independent of the machine, as follows:

$$
\begin{align*}
t_{2N} &\approx 2cN\lg {2N} \\
&=2cN+2cN\lg N \\
&\approx 2cN+2t_N \\
&\approx 2t_N(1+1/\lg N) && \text{($cN\approx t_N/\lg N$)}.
\end{align*}
$$

The next Python script shows how well our estimated $t_{2N}$ reflects the actual $t_{2N}$. We can run different experiments by altering the "machine-dependent" constants $c$ and $d$.
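A minimal reconstruction of such a script might look like this; the particular values of `c` and `d` are arbitrary placeholders:

```python
import math

c, d = 2.5e-8, 1.0e-7  # made-up "machine-dependent" constants

def t(n):
    # Model running time t_N = c*N*lg N + d*N.
    return c * n * math.log2(n) + d * n

for e in range(3, 8):
    n = 10 ** e
    actual = t(2 * n)
    estimate = 2 * t(n) * (1 + 1 / math.log2(n))
    print(f"N=10^{e}  relative error = {abs(actual - estimate) / actual:.4%}")
```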

The output shows that the relative error drops quickly as $N$ increases.

Exercise 1.8

Let's use the Python implementation of mergesort from GeeksforGeeks and replace the driver code with the following snippet:

Here’s an output on a MacBook Pro 2019:

Exercise 1.9

The program Sort2distinct.java implements this algorithm in Java.
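The Java program itself isn't reproduced here; the core idea is a single 3-way (Dutch national flag) partitioning pass, sketched below in Python as a reconstruction rather than the program's actual code:

```python
def partition3(a, pivot):
    # One pass: everything < pivot to the left, == pivot in the middle,
    # > pivot to the right (Dutch national flag scheme).
    lt, i, gt = 0, 0, len(a) - 1
    while i <= gt:
        if a[i] < pivot:
            a[lt], a[i] = a[i], a[lt]
            lt += 1
            i += 1
        elif a[i] > pivot:
            a[i], a[gt] = a[gt], a[i]
            gt -= 1
        else:
            i += 1
    return lt, gt

# With only two distinct values present, one pass around either value
# sorts the array.
a = [1, 0, 1, 1, 0, 0, 1, 0]
partition3(a, 1)
print(a)  # [0, 0, 0, 0, 1, 1, 1, 1]
```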


Exercise 1.10

Perform an initial linear scan to record one set of indices $i$, $j$ and $k$ for those three distinct values. Determine the index of the median and exchange it with the element at the beginning of the array. Finally, run the 3-way partitioning procedure from Exercise 1.9.


As additional practice, solve the LeetCode problem Sort Colors.

Exercise 1.11 🌟


The following Python program implements this concept. The merge procedure uses sentinels to be aligned with the version from the book.
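A minimal stand-in for such a program (the names are mine; with the sentinel trick, every output element costs exactly one compare, so the count is trivial to track):

```python
import random
import time

def merge_count(a):
    # Top-down mergesort returning (sorted list, number of compares).
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, cl = merge_count(a[:mid])
    right, cr = merge_count(a[mid:])
    left.append(float('inf'))   # sentinels: one compare per output element
    right.append(float('inf'))
    merged, i, j = [], 0, 0
    for _ in range(len(a)):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged, cl + cr + len(a)

for e in range(3, 6):
    n = 10 ** e
    data = [random.random() for _ in range(n)]
    start = time.perf_counter()
    _, compares = merge_count(data)
    micros = (time.perf_counter() - start) * 1e6
    print(f"N=10^{e}  t_N/c_N = {micros / compares:.4f}")
```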

The running time is shown in microseconds. The ratio $t_N/c_N$ stabilizes as the problem size increases, although there are some fluctuations. Here is the output generated on a MacBook Air 2025 computer:

On a MacBook Pro 2019 machine, the constant approached $0.3$.

Exercise 1.12

Let $T_N$ denote the cumulative (total) number of compares used by quicksort over all $N!$ permutations of $N$ elements. The average number of compares is $C_N=T_N/N!$, so the base cases are $T_2=6$ and $T_1=T_0=0$. For $N>2$, we can derive $T_N$ from the book's simplified recurrence for $C_N$.

$$
\begin{align*}
NC_N&=(N+1)C_{N-1}+2N \\[0.2cm]
N\left(\frac{T_N}{N!}\right) &= (N+1)\left(\frac{T_{N-1}}{(N-1)!}\right)+2N \\[0.4cm]
\frac{T_N}{(N-1)!} &= (N+1)\frac{T_{N-1}}{(N-1)!}+2N \\[0.4cm]
&\therefore \boxed{T_N = (N+1)T_{N-1}+2\cdot N!}.
\end{align*}
$$
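A quick consistency check, taking the simplified recurrence $NC_N=(N+1)C_{N-1}+2N$ as given:

```python
from math import factorial

C = {2: 3.0}  # C_2 = T_2 / 2! = 3
T = {2: 6}
for n in range(3, 15):
    C[n] = ((n + 1) * C[n - 1] + 2 * n) / n
    T[n] = (n + 1) * T[n - 1] + 2 * factorial(n)

# T_N / N! reproduces C_N for every n.
print(all(abs(T[n] / factorial(n) - C[n]) < 1e-9 for n in range(3, 15)))  # True
```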

Exercise 1.13 🌟


The proof below is adapted from Sedgewick's PhD thesis.

Suppose that a permutation of $\{1,2,\dots,N\}$ is being partitioned, with all $N!$ permutations equally likely. Choose a pivot $1 \le v \le N$, so that after partitioning the size of the left subarray becomes $v-1$. Any rearrangement of these $v-1$ elements, out of the possible $(v-1)!$, in the original permutation is mapped to a unique permutation in the left subarray. In other words, we can establish a bijection between these two sets. Since all the original permutations are equally likely, the same is true for the permutations in the left subarray. Analogous reasoning applies, independently, to the right subarray. This concludes the proof.

Moving the pivot during the partitioning process, by setting $j:=hi+1$, introduces bias into the subarrays (unless the pivot already happens to be in its final position). The probability of having a permutation where the pivot remains in its initial position is quite low. In our case, the first element from the left larger than the pivot will trigger it to move; it tends to fall near the beginning of the left subarray. Therefore, not all positions are equally likely to contain $v$.


Exercise 1.14

We follow the technique from the book to eliminate the summation on the RHS. We have

$$
\begin{align*}
A_N&=1+\frac{2}{N}\sum_{1\le j \le N}A_{j-1} && \text{(for $N>0$)} \\
NA_N&=N+2\sum_{1\le j \le N}A_{j-1} \\
NA_N-(N-1)A_{N-1}&=1+2A_{N-1} && \text{(for $N>1$, after eliminating the summation)} \\
NA_N&=(N+1)A_{N-1}+1 \\
\frac{A_N}{N+1}&=\frac{A_{N-1}}{N}+\frac{1}{(N+1)N} \\
\frac{A_N+1}{N+1}&=\frac{A_{N-1}+1}{N} && \text{$\left(\frac{1}{(N+1)N}=\frac{1}{N}-\frac{1}{N+1}\right)$} \\
\frac{A_N+1}{N+1}&=\frac{A_{N-1}+1}{N}=\dots=\frac{A_1+1}{2} && \text{(iterating)} \\
&\therefore \boxed{A_N=N}.
\end{align*}
$$
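The boxed result is easy to double-check directly against the recurrence:

```python
# A_N = 1 + (2/N) * sum_{1<=j<=N} A_{j-1}, with A_0 = 0.
A = [0.0]
for n in range(1, 100):
    A.append(1 + 2 * sum(A[:n]) / n)

print(all(abs(A[n] - n) < 1e-9 for n in range(100)))  # True
```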

Exercise 1.15 🌟


The proof is adapted from Sedgewick's PhD thesis.

WLOG assume that the input array is a random permutation of the set $\{1,2,\dots,N\}$. The base cases are $B_1=B_0=0$ and the recurrence is

$$
\begin{align*}
B_N=&\frac{1}{N}\sum_{1 \le v\le N} \bigg\{\text{average number of exchanges when the pivot is $v$}\bigg\} \\
&+\frac{2}{N}\sum_{1\le j \le N}B_{j-1} && \text{(for $N>1$)}.
\end{align*}
$$

Let's analyze the first term of the sum by looking at Program 1.2 from the book. Quicksort is parsimonious regarding exchanges; only those elements are swapped that must be swapped. If the pivot is $v=a[N]$, then only those keys among $a[v],\dots,a[N-1]$ which are smaller than $v$ are exchanged into the right subarray. Furthermore, each element is exchanged at most once during the whole partitioning process. Let $X_v$ be the random variable denoting the number of exchanges used during the first partitioning stage when the pivot is $v$. The probability mass function of $X_v$ is given by

$$\Pr\{X_v=k\}=\frac{\dbinom{N-v}{N-v-k}\dbinom{v-1}{k}}{\dbinom{N-1}{N-v}}=\frac{\dbinom{N-v}{k}\dbinom{v-1}{v-1-k}}{\dbinom{N-1}{v-1}}.$$

The numerator expresses in how many ways we can select $k$ smaller elements from the left subarray and use them to form the right subarray. The denominator is just the total number of possible combinations to form the right subarray. As a side note, the right side of the equation follows from applying the identity $\binom{n}{k}=\binom{n}{n-k}$ and shows the symmetry in the problem, i.e., it describes the case when $a[1]$ is chosen as the pivot. At any rate, $k \in [0,v-1]$.

The average number of exchanges is the expectation of $X_v$:

$$
\begin{align*}
\mathbb{E}[X_v] &= \sum_{k=0}^{v-1} k\Pr\{X_v=k\} \\[0.5cm]
&= \sum_{k=1}^{v-1} k \cdot \frac{\dbinom{N-v}{k}\dbinom{v-1}{v-1-k}}{\dbinom{N-1}{v-1}} \\[0.8cm]
&= \frac{N-v}{\dbinom{N-1}{v-1}} \sum_{k=1}^{v-1} \binom{N-v-1}{k-1}\binom{v-1}{v-1-k} \\[0.8cm]
&= \frac{N-v}{\dbinom{N-1}{v-1}} \sum_{k=0}^{v-2} \binom{N-v-1}{k}\binom{v-1}{v-2-k} \\[0.8cm]
&= \frac{N-v}{\dbinom{N-1}{v-1}} \binom{N-2}{v-2} && \text{(Vandermonde convolution)} \\[0.8cm]
&= \frac{(N-v)(v-1)}{N-1}.
\end{align*}
$$
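The closed form can be verified numerically from the pmf; conveniently, math.comb returns 0 whenever $k$ exceeds $N-v$:

```python
from math import comb

def expected_exchanges(N, v):
    # E[X_v] = sum_k k * Pr{X_v = k} over the hypergeometric pmf above.
    return sum(k * comb(N - v, k) * comb(v - 1, v - 1 - k)
               for k in range(v)) / comb(N - 1, v - 1)

print(all(abs(expected_exchanges(N, v) - (N - v) * (v - 1) / (N - 1)) < 1e-9
          for N in range(2, 16) for v in range(1, N + 1)))  # True
```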

This gives

$$
\begin{align*}
B_N&=\frac{1}{N}\sum_{1 \le v\le N} \mathbb{E}[X_v]+\frac{2}{N}\sum_{1\le j \le N}B_{j-1} \\
&=\frac{1}{N(N-1)}\sum_{1 \le v\le N} (N-v)(v-1)+\frac{2}{N}\sum_{1\le j \le N}B_{j-1} \\
&= \frac{1}{N(N-1)}\binom{N}{3}+\frac{2}{N}\sum_{1\le j \le N}B_{j-1} \\
&= \frac{N-2}{6}+\frac{2}{N}\sum_{1\le j \le N}B_{j-1}.
\end{align*}
$$

Exercise 1.16 🌟


Let $B_{Nk}$ denote the number of subarrays of size $k>0$ encountered, on the average, when sorting a random array of size $N$ with quicksort. We have 3 cases:

  1. If $N<k$, then we can't have a subarray of size $k$, so $B_{Nk}=0$.

  2. If $N=k$, then $B_{Nk}=1$, since only the initial array is of size $k$.

  3. If $N>k$, then the initial call is on a subarray whose size is larger than $k$, hence we don't count it. We average the counts from the resulting subproblems. The recurrence is

$$B_{Nk}=\frac{2}{N}\sum_{1\le j \le N}B_{(j-1)k}.$$

Fast-forwarding through the exact same steps as in Exercise 1.14, we get

$$\frac{B_{Nk}}{N+1}=\frac{B_{(N-1)k}}{N}=\dots=\frac{B_{(k+1)k}}{k+2} \quad \text{(for $N>k+1$)}.$$

Observe that we can't iterate further than $B_{(k+1)k}$. Nonetheless, we can calculate it using the main recurrence to get $B_{(k+1)k}=2/(k+1)$; all base cases are zeros except $B_{kk}=1$. Therefore, we have

$$B_{Nk}=\frac{2(N+1)}{(k+1)(k+2)} \quad \text{(for $N>k$)}.$$
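Again, the closed form is easy to validate against the recurrence:

```python
def B_table(k, n_max):
    # B_{N,k} = (2/N) * sum_{1<=j<=N} B_{(j-1),k} for N > k,
    # with B_{k,k} = 1 and zeros below.
    B = [0.0] * (n_max + 1)
    B[k] = 1.0
    for n in range(k + 1, n_max + 1):
        B[n] = 2 * sum(B[:n]) / n
    return B

ok = all(abs(B_table(k, 60)[n] - 2 * (n + 1) / ((k + 1) * (k + 2))) < 1e-9
         for k in range(1, 8) for n in range(k + 1, 61))
print(ok)  # True
```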

A simple variation of this problem is the review question Q1.3; the answer is

$$B_{Nk}=\frac{k(N+1)}{k+2}\qquad \text{(for $N>k$)}.$$

Exercise 1.17

Fast-forwarding through the exact same steps as in the proof of Theorem 1.3, we get

$$
\begin{align*}
\frac{C_N}{N+1}&=\frac{C_{M+1}}{M+2}+2\sum_{M+3 \le k \le N+1} \frac{1}{k} && \text{(for $N>M+1$)}\\
&=\frac{C_{M+1}}{M+2}+2(H_{N+1}-H_{M+2}).
\end{align*}
$$

We can calculate $C_{M+1}$ using the main recurrence:

$$
\begin{align*}
C_{M+1} &= (M+1)+1+\frac{2}{M+1}\sum_{1 \le j \le M+1} C_{j-1} \\
&= M+2+\frac{2}{M+1}\sum_{0 \le j \le M} \frac{j(j-1)}{4} && \text{(reindexing and substituting $C_j$)} \\
&= M+2+\frac{1}{2(M+1)}\sum_{1 \le j \le M} j(j-1) \\
&= M+2+\frac{1}{2(M+1)}\left(\frac{M(M+1)(2M+1)}{6}-\frac{M(M+1)}{2}\right) \\
&= M+2+\frac{M(M-1)}{6}.
\end{align*}
$$

The formula for the average number of compares in quicksort, when using insertion sort to handle small subarrays, is

$$C_N=(N+1)\left(1+\frac{M(M-1)}{6(M+2)}\right)+2(N+1)(H_{N+1}-H_{M+2}) \quad \text{(for $N>M$)}.$$
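The formula can be checked against the recurrence with the insertion-sort base cases $C_j=j(j-1)/4$ for $j\le M$:

```python
def H(n):
    # Harmonic number H_n.
    return sum(1 / k for k in range(1, n + 1))

def C_table(M, n_max):
    # C_N = N + 1 + (2/N) * sum_{1<=j<=N} C_{j-1} for N > M,
    # with C_j = j*(j-1)/4 for j <= M (insertion sort on small files).
    C = [j * (j - 1) / 4 for j in range(M + 1)]
    for n in range(M + 1, n_max + 1):
        C.append(n + 1 + 2 * sum(C) / n)
    return C

def C_formula(N, M):
    return ((N + 1) * (1 + M * (M - 1) / (6 * (M + 2)))
            + 2 * (N + 1) * (H(N + 1) - H(M + 2)))

ok = all(abs(C_table(M, 50)[n] - C_formula(n, M)) < 1e-9
         for M in (1, 3, 7) for n in range(M + 1, 51))
print(ok)  # True
```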

Exercise 1.18 🌟


After ignoring small terms (those significantly less than $N$) as well as constant multiples of $N$ (they don't influence the minimization task), we arrive at the following approximation:

$$C_N\approx 2N\ln N +\underbrace{\left(\frac{M(M-1)}{6(M+2)}-2\ln {M}\right)}_{=f(M)}N.$$

We can use a suitable math tool, like WolframAlpha, to solve the minimization problem as well as plot the graph of $f(M)$. Issue `plot m(m-1)/(6(m+2))-2ln(m) for 1<=m<60` to see its shape.

Plot of $f(M)$ showing a global minimum around 12.

You can find the minimum by issuing `m(m-1)/(6(m+2))-2ln(m) minimize for m>1`. It reports a global minimum at $M \approx 12.4$.
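If you prefer staying in Python, a simple grid search reproduces the same minimum:

```python
from math import log

def f(m):
    return m * (m - 1) / (6 * (m + 2)) - 2 * log(m)

# Search the interval (1, 60) in steps of 0.01.
best_m = min((m / 100 for m in range(101, 6000)), key=f)
print(best_m)  # close to 12.4, matching the WolframAlpha result
```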


You may use a different approximation, like using $\lg (M+2)$ instead of $\lg M$ inside $f(M)$, but the outcome would be similar. Also, you may want to try the more accurate expression for $C_N$ (see the remark in the previous exercise).

Exercise 1.19

According to the graph from the previous exercise, the threshold is $M=49$.

Exercise 1.20

Below is the Python script that reports how many keys in a file of $N=10^4$ random positive integers less than $M=10^6$ are likely to be equal to some other keys in the file.
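The original script isn't reproduced here; a stand-in sketch follows, where $N - N(1-1/M)^{N-1}$ ($N$ minus the expected number of singleton keys) plays the role of the birthday-style analytic estimate the text refers to:

```python
import random
from collections import Counter

random.seed(42)  # repeatable runs
N, M = 10 ** 4, 10 ** 6
keys = [random.randrange(M) for _ in range(N)]

# Keys that collide with at least one other key.
duplicated = sum(c for c in Counter(keys).values() if c > 1)

# Birthday-style estimate: N minus the expected number of singletons.
analytic = N - N * (1 - 1 / M) ** (N - 1)
print(duplicated, round(analytic, 1))
```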

The output shows an excellent alignment between empirical evidence and analytic result:

The analytic formula is the same as for the number of people with a shared birthday; replace the number of days $d$ with $M$.


The program sets the seed of the random number generator, which makes the results repeatable even though the numbers are randomly generated.

Exercise 1.21

Try changing the relative magnitudes of $N$ and $M$ in the previous exercise to see how the number of duplicates increases as $M$ drops, especially when $M \ll N$. The random permutation model assumes that all elements are distinct and that all $N!$ permutations are equally likely to occur. Nevertheless, even in the presence of many duplicates, Program 1.2 retains good performance. As emphasized in Sedgewick's paper Quicksort with Equal Keys, the evidence in favor of stopping the scanning pointers on keys equal to the partitioning element in quicksort is overwhelming. This is exactly how partitioning works in Program 1.2.

There are also other schemes to efficiently handle duplicates. The Algorithms, 4th Edition by RS, KW book's website has an extensive treatment of quicksort, including variants like fast 3-way and dual-pivot. There's also background information about entropy-optimal sorting, which states the following:

Arrays with large numbers of duplicate sort keys arise frequently in applications. In such applications, there’s potential to reduce the time of the sort from linearithmic to linear.

Thus, this is one example where the random permutation model is inaccurate. Of course, the story can go the other way around when partitioning doesn't account for duplicates; in that case, quicksort may show quadratic runtime.

Exercise 1.22

Essentially, the distribution of keys is "asymptotically" immaterial for mergesort, as it achieves $\Theta(N\lg N)$ runtime in all cases. The sorting numbers specify the worst-case number of comparisons made by mergesort (see also Exercise 1.5). The best case takes about half as many comparisons as the worst case; this doesn't apply to Program 1.1 (see Exercise 1.23). All in all, something like Table 1.1 isn't useful for gaining insight into stochastic fluctuations.

On the other hand, Table 1.1 could validate a model, implementation and/or analysis. It could help in revealing machine-dependent constants in a model (see Exercise 1.11) or pinpoint awkward implementation-related issues, perhaps bugs, if the empirical evidence goes against mathematical predictions. Finally, it may assure us that the model for a new variant of mergesort aligns well with empirical data.

Exercise 1.23

The running time of mergesort as implemented in Program 1.1 depends only on the number of elements in the array being sorted, not on their arrangement. Therefore, the standard deviation in the number of compares is zero, since the input size fully determines the cost.
