# Chapter 7

## Exercise 7.1-1

The coloring is similar to the book's, and the indexing is the same.

13  19  9  5  12  8  7  4  21  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:blue;">13</mark>  19  9  5  12  8  7  4  21  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:blue;">13  19</mark>  9  5  12  8  7  4  21  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9</mark>  <mark style="background-color:blue;">19  13</mark>  5  12  8  7  4  21  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9  5</mark>  <mark style="background-color:blue;">13  19</mark>  12  8  7  4  21  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9  5</mark>  <mark style="background-color:blue;">13  19  12</mark>  8  7  4  21  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9  5  8</mark>  <mark style="background-color:blue;">19  12  13</mark>  7  4  21  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9  5  8  7</mark>  <mark style="background-color:blue;">12  13  19</mark>  4  21  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9  5  8  7  4</mark>  <mark style="background-color:blue;">13  19  12</mark>  21  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9  5  8  7  4</mark>  <mark style="background-color:blue;">13  19  12  21</mark>  2  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9  5  8  7  4  2</mark>  <mark style="background-color:blue;">19  12  21  13</mark>  6  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9  5  8  7  4  2  6</mark>  <mark style="background-color:blue;">12  21  13  19</mark>  <mark style="background-color:yellow;">11</mark>\ <mark style="background-color:red;">9  5  8  7  4  2  6</mark>  <mark 
style="background-color:yellow;">11</mark>  <mark style="background-color:blue;">21  13  19  12</mark>

## Exercise 7.1-2

**Available in the latest revision of the IM.**

## Exercise 7.1-3

**Available in the latest revision of the IM.**

## Exercise 7.1-4

**Available in the latest revision of the IM.**

## Exercise 7.2-1

**Available in the latest revision of the IM.**

## Exercise 7.2-2

**Available in the latest revision of the IM.**

{% hint style="info" %}
If we use the partitioning scheme introduced in [Exercise 7.1-2](#exercise-7.1-2), then the running time is $$\Theta(n \lg n)$$.
{% endhint %}

## Exercise 7.2-3

**Available in the latest revision of the IM.**

## Exercise 7.2-4

**Available in the latest revision of the IM.**

{% hint style="warning" %}
Nearly sorted input will push QuickSort toward the worst-case scenario, rather than the best-case, as hinted in the IM.
{% endhint %}

## Exercise 7.2-5

**Available in the latest revision of the IM.**

## Exercise 7.2-6

**Available in the latest revision of the IM.**

{% hint style="info" %}
In this exercise, we are asked to derive an approximate probability. Thus, we can ignore floors, ceilings and the pivot in calculations. The analysis in the IM follows this assumption.
{% endhint %}

## Exercise 7.3-1

**Available in the latest revision of the IM.**

## Exercise 7.3-2

**Available in the latest revision of the IM.**

{% hint style="warning" %}
The analysis of the best case in the IM is a bit inaccurate. To sort an array of $$n$$ elements, `Randomized-QuickSort` must find the matching permutation of elements, that is, find their sorted positions. Each call to `Randomized-Partition` fixes one pivot, and the leaves of the recursion tree are singleton calls to `Randomized-QuickSort`. In the best case, the recursion tree is as balanced as possible at every level. With maximally balanced splits, the number of pivots equals the number of non-singleton subarrays, which is $$\lceil n/2 \rceil$$, so there are $$\lfloor n/2 \rfloor$$ leaves. Despite reaching the same conclusion as the IM, namely $$\Theta(n)$$ calls to `Random`, it is important to be accurate in the analysis.
{% endhint %}

## Exercise 7.4-1

We guess that $$T(n) \ge cn^2$$ for some constant $$c >0$$. Substituting this guess into recurrence (7.1) yields

$$
\begin{align\*}
T(n)&\ge\max\{cq^2+c(n-1-q)^2:0\le q\le n-1\}+\Theta(n) \\
&=c\max\{q^2+(n-1-q)^2:0\le q\le n-1\}+\Theta(n)
\end{align\*}
$$

The expression being maximized attains its largest value $$(n-1)^2$$ at $$q=0$$ or $$q=n-1$$ (see [Exercise 7.4-3](#exercise-7.4-3)), as explained in the book, so the maximization term is at least $$(n-1)^2$$. Continuing our analysis of $$T(n)$$, we obtain

$$
\begin{align\*}
T(n) &\ge c(n-1)^2+\Theta(n) \\
&=cn^2-c(2n-1)+\Theta(n) \\
&\ge cn^2
\end{align\*}
$$

by picking the constant $$c$$ small enough that the $$\Theta(n)$$ term dominates the $$c(2n-1)$$ term. Therefore, $$T(n)=\Omega(n^2)$$.
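
To make "small enough" concrete, suppose the $$\Theta(n)$$ term is at least $$dn$$ for some constant $$d>0$$ and all sufficiently large $$n$$. Then the last step requires

$$
cn^2-c(2n-1)+dn \ge cn^2 \iff dn \ge c(2n-1)
$$

which holds for every $$n\ge1$$ whenever $$c\le d/2$$, since then $$c(2n-1)\le 2cn\le dn$$.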

## Exercise 7.4-2

**Available in the latest revision of the IM.**

## Exercise 7.4-3

A quadratic polynomial in $$q$$ has at most two distinct zeros. As explained in the book, $$0 \le q \le n-1$$ implies that $$2q(q-(n-1)) \le 0$$, so $$q^2+(n-1-q)^2$$ attains its maximum, over all values of $$q$$, exactly when this expression equals zero. Its two distinct zeros are $$q=0$$ and $$q=n-1$$.
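
Expanding the square makes this explicit:

$$
q^2+(n-1-q)^2=q^2+(n-1)^2-2q(n-1)+q^2=(n-1)^2+2q(q-(n-1))
$$

The correction term $$2q(q-(n-1))$$ is nonpositive for $$0\le q\le n-1$$ and vanishes only at $$q=0$$ and $$q=n-1$$, which is where the maximum $$(n-1)^2$$ is attained.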

## Exercise 7.4-4

$$
\begin{align\*}
\mathbb{E}{\[X]} &= \sum\_{i=1}^{n-1} \sum\_{j=i+1}^n \frac{2}{j-i+1} \\
&= \sum\_{i=1}^{n-1} \sum\_{k=1}^{n-i} \frac{2}{k+1} \\
&= 2\sum\_{i=1}^{n-1} \left(\sum\_{k=1}^{n-i+1} \frac{1}{k}-1\right) \\
&\ge 2\sum\_{i=1}^{n-1} \left(\ln {(n-i+2)} -1\right) && \text{(by equation (A.20))} \\
&= 2\left(\sum\_{i=3}^{n+1} \ln i - (n-1)\right) \\
&= 2 \left(\ln \prod\_{i=3}^{n+1} i-(n-1)\right) \\
&\ge 2\ln \prod\_{i=1}^n i-2n && \text{(for sufficiently large $n$)}  \\
&= 2\ln n! -2n \\
&= \Theta(n\lg n) && \text{(by equation (3.28))}
\end{align\*}
$$

This bound and Lemma 7.1 allow us to conclude that the expected running time of `Randomized-Quicksort` is $$\Omega(n\lg n)$$.

## Exercise 7.4-5

Insertion sort requires $$O(k^2)$$ time to sort an array of size $$k$$. The total time spent by insertion sort is $$O(n/k) \cdot O(k^2)=O(nk)$$, since we can have at most $$n/k$$ subarrays of size $$k$$. The expected number of levels in the recursion tree is $$O(\lg (n/k))$$. Therefore, the total time spent in partitioning subarrays is $$O(n\lg(n/k))$$. After summing up these terms, we get that the randomized version of this sorting algorithm runs in $$O(nk+n\lg (n/k))$$ expected time.

The modified algorithm has the same asymptotic running time as standard randomized quick-sort when $$O(nk+n\lg (n/k))=O(n\lg n)$$. The largest asymptotic value of $$k$$, as a function of $$n$$, that satisfies this condition is $$k=\Theta(\lg n)$$. Notice that $$k$$ cannot be asymptotically larger than $$\Theta(\lg n)$$, otherwise $$nk=\omega(n\lg n)$$ and we would make things worse. On the other hand, plugging in $$k=\lg n$$ into our formula for the expected running time gives

$$
O(n\lg n+n\lg (n/\lg n))=O(2n\lg n-n\lg \lg n)=O(n\lg n) \space\checkmark
$$

In practice, you would need to find the best $$k$$ experimentally, since performance depends on many machine-related factors. Usually, you would select the largest $$k$$ among all values that still give a measurable performance advantage.
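
To make the scheme concrete, here is a minimal runnable sketch in Python; the names `CUTOFF`, `insertion_sort` and `hybrid_quicksort` are ours, not from the book. This variant sorts each small subarray immediately; the exercise's variant instead leaves them unsorted and finishes with a single insertion-sort pass over the whole array, which has the same $$O(nk+n\lg(n/k))$$ bound.

```python
import random

CUTOFF = 16  # plays the role of k; the best value is machine-dependent

def insertion_sort(A, lo, hi):
    # Sorts A[lo..hi] in place in O(k^2) time for a subarray of size k.
    for i in range(lo + 1, hi + 1):
        key = A[i]
        j = i - 1
        while j >= lo and A[j] > key:
            A[j + 1] = A[j]
            j -= 1
        A[j + 1] = key

def randomized_partition(A, lo, hi):
    # Lomuto partition around a uniformly chosen pivot.
    k = random.randint(lo, hi)
    A[k], A[hi] = A[hi], A[k]
    x = A[hi]
    i = lo - 1
    for j in range(lo, hi):
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[hi] = A[hi], A[i + 1]
    return i + 1

def hybrid_quicksort(A, lo=0, hi=None):
    if hi is None:
        hi = len(A) - 1
    if hi - lo + 1 <= CUTOFF:
        insertion_sort(A, lo, hi)   # small subarrays go to insertion sort
    else:
        q = randomized_partition(A, lo, hi)
        hybrid_quicksort(A, lo, q - 1)
        hybrid_quicksort(A, q + 1, hi)
```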

## Exercise 7.4-6

If the median falls into the lowest or largest $$\alpha n$$ numbers, then we get a split worse than $$\alpha$$-to-$$(1-\alpha)$$. This happens when at least two of the three randomly selected numbers fall into the same one of these two ranges. We have the following 4 disjoint events:

1. All three numbers are among the lowest $$\alpha n$$ values.
2. All three numbers are among the largest $$\alpha n$$ values.
3. Exactly two numbers are among the lowest $$\alpha n$$ values.
4. Exactly two numbers are among the largest $$\alpha n$$ values.

Therefore, the total probability is

$$
p=\alpha^3+\alpha^3+\binom{3}{2}\alpha^2(1-\alpha)+\binom{3}{2}\alpha^2(1-\alpha)=2\alpha^2(3-2\alpha)
$$
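
As a sanity check, the formula can be verified empirically. The sketch below (the function name is ours) models the array as the continuous range from 0 to 1, so "the median of 3 random elements lands among the lowest or largest $$\alpha n$$ values" becomes "the median of 3 uniform draws lands below $$\alpha$$ or above $$1-\alpha$$":

```python
import random

def bad_split_probability(alpha, trials=200_000, seed=1):
    # Estimates the probability that the median of 3 uniform draws from
    # [0, 1) falls into the lowest or highest alpha-fraction of the range.
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        m = sorted(rng.random() for _ in range(3))[1]  # median of three
        if m < alpha or m >= 1 - alpha:
            bad += 1
    return bad / trials

# For alpha = 1/4 the formula gives 2 * (1/4)^2 * (3 - 1/2) = 0.3125,
# and the estimate should land close to that value.
```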

## Problem 7-1

### a.

Initialization:\
A=(13, 19, 9, 5, 12, 8, 7, 4, 11, 2, 6, 21)\
p=1\
r=12\
x=13\
i=0\
j=13

After Iteration 1:\
A=(6, 19, 9, 5, 12, 8, 7, 4, 11, 2, 13, 21)\
i=1\
j=11

After Iteration 2:\
A=(6, 2, 9, 5, 12, 8, 7, 4, 11, 19, 13, 21)\
i=2\
j=10

After Iteration 3:\
A=(6, 2, 9, 5, 12, 8, 7, 4, 11, 19, 13, 21)\
i=10\
j=9\
RETURN 9

### b.

`Hoare-Partition` produces a balanced split instead of hitting the worst-case scenario, as `Partition` does. It also distributes elements equal to the pivot between the two sides more evenly, so it works well even in the presence of duplicates.

### c.

We focus on the first iteration and show what happens to indices $$i$$ and $$j$$, as well as how the contents of $$A$$ ensure that neither index ever falls out of range. Initially, before entering the `while` loop, we have $$x=A\[p]$$, $$i=p-1$$ and $$j=r+1$$. The body of the loop runs as follows:

1. Index $$j$$ is decremented before $$A$$ is read, so $$p \le j \le r$$ on the first access.
2. It keeps being decremented until it reaches an element not greater than $$x$$. Since $$A\[p]=x$$, index $$j$$ cannot move past position $$p$$, hence we know that $$p\le j$$ after line 7.
3. Index $$i$$ is incremented before $$A$$ is read, so $$p \le i\le r$$ on the first access.
4. It keeps being incremented until it reaches an element not smaller than $$x$$. Since $$A\[p]=x$$, index $$i$$ stops immediately, hence we know that $$i=p$$ after line 10.

At this point, we have two possibilities at line 11:

1. If $$i=j$$, then the function returns $$p$$. This occurs when all other elements are greater than $$x$$.
2. Otherwise, we must have $$i\<j$$, so $$A\[i]$$ and $$A\[j]$$ are swapped. Exactly these values, now on opposite sides relative to indices $$i$$ and $$j$$, guard $$i$$ and $$j$$ from moving past them. Furthermore, if either $$i$$ or $$j$$ (or both) stops at one of these locations, then we would surely have $$i > j$$ and the function would return $$j$$.

### d.

Again, we focus our attention on the first iteration. From part (c) we know that $$p \le j$$ at termination. Since index $$j$$ is decremented before $$A$$ is accessed, we also have $$j \le r$$. Now, based on the fact that $$p\<r$$, there are two scenarios:

1. If $$A\[p] \ge A\[r]$$, then these values are exchanged, since at line 11 $$i=p$$ and $$j=r$$. In the next iteration, $$j$$ will be decremented at least once.
2. Otherwise, $$j$$ is immediately decremented below $$r$$.

At any rate, at termination we have that $$p \le j < r$$.

### e.

Let's prove the following invariant:

At the beginning of each iteration of the `while` loop of lines 4-13, we have $$i\<j$$, the subarray $$A\[p:i]$$ contains elements less than or equal to the pivot, and the subarray $$A\[j:r]$$ contains elements greater than or equal to the pivot.

#### Initialization

Before entering the loop for the first time, we have $$x=A\[p]$$, $$i=p-1$$ and $$j=r+1$$. The empty subarrays $$A\[p:i]$$ and $$A\[j:r]$$ satisfy the invariant and $$i\<j$$.

#### Maintenance

The `repeat-until` loop at lines 5-7 decreases $$j$$ up to some position $$j'\<j$$. We know that every element of $$A\[j'+1:j]$$ is greater than or equal to $$x$$. So, $$A\[j'+1:j] \cup A\[j:r]=A\[j'+1:r]$$ has only elements greater than or equal to $$x$$.

The `repeat-until` loop at lines 8-10 increases $$i$$ up to some position $$i'>i$$. We know that every element of $$A\[i:i'-1]$$ is less than or equal to $$x$$. So, $$A\[p:i] \cup A\[i:i'-1] =A\[p:i'-1]$$ has only elements less than or equal to $$x$$.

**Postcondition**: Let $$i=i'$$ and $$j=j'$$. After line 10, but before entering line 11, we have that the subarray $$A\[p:i-1]$$ contains elements less than or equal to the pivot, and the subarray $$A\[j+1:r]$$ contains elements greater than or equal to the pivot. Furthermore, $$A\[i] \ge x$$ and $$A\[j] \le x$$.

If $$i\<j$$ at line 11, then $$A\[i]$$ and $$A\[j]$$ are exchanged, so that the subarray $$A\[p:i]$$ contains elements less than or equal to the pivot and the subarray $$A\[j:r]$$ contains elements greater than or equal to the pivot. Hence, the invariant is reestablished for the next iteration.

#### Termination

The `while` loop must eventually terminate, since the two indices move closer to each other in every iteration. Termination happens when $$i\ge j$$ at line 11. In that case, no elements are exchanged, so the postcondition stated in the previous section remains true. We can distinguish the following cases:

1. If $$i=j$$, then $$A\[i]=A\[j]$$, thus the subarray $$A\[p:j]$$ contains elements less than or equal to the pivot and the subarray $$A\[j+1:r]$$ contains elements greater than or equal to the pivot.
2. Otherwise, $$i>j \implies i-1 \ge j$$, so the above conclusion again follows.

In both cases, it is true that every element of $$A\[p:j]$$ is less than or equal to every element of $$A\[j+1:r]$$.

### f.

```
Quicksort(A, p, r)
    if p < r
        q = Hoare-Partition(A, p, r)
        Quicksort(A, p, q)
        Quicksort(A, q + 1, r)
```
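
The pseudocode above translates directly into a runnable version; here is a sketch in Python (0-indexed, with `A[lo]` as the pivot, as in the problem statement). On the array from part (a), `hoare_partition` returns the 0-based index 8, matching the 1-based result 9.

```python
def hoare_partition(A, lo, hi):
    # Partitions A[lo..hi] around x = A[lo] and returns an index j with
    # lo <= j < hi such that every element of A[lo..j] is less than or
    # equal to every element of A[j+1..hi].
    x = A[lo]
    i = lo - 1
    j = hi + 1
    while True:
        j -= 1
        while A[j] > x:     # lines 5-7: repeat j = j - 1 until A[j] <= x
            j -= 1
        i += 1
        while A[i] < x:     # lines 8-10: repeat i = i + 1 until A[i] >= x
            i += 1
        if i < j:
            A[i], A[j] = A[j], A[i]
        else:
            return j

def quicksort(A, lo=0, hi=None):
    if hi is None:
        hi = len(A) - 1
    if lo < hi:
        q = hoare_partition(A, lo, hi)
        quicksort(A, lo, q)        # note: A[lo..q], not A[lo..q-1]
        quicksort(A, q + 1, hi)
```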

## Problem 7-2

**Available in the latest revision of the IM.**

{% hint style="danger" %}
There is an omission in the IM for part (c). `Randomized-Partition'` also differs at line 2, where $$A\[i]$$ must be exchanged with $$A\[p]$$ instead of $$A\[r]$$.
{% endhint %}

{% hint style="warning" %}
In part (d) the IM does not cover the best-case scenario. Namely, when all element values are equal, `Quicksort'` runs in linear time.
{% endhint %}

## Problem 7-3

**Available in the latest revision of the IM.**

## Problem 7-4

**Available in the latest revision of the IM.**

## Problem 7-5

**Available in the latest revision of the IM.**

## Problem 7-6

### a.

The three randomly selected elements must include $$z\_i$$ itself, one smaller than $$z\_i$$, and one larger than $$z\_i$$. These elements can be chosen in any order, so we should count the number of 3-sets satisfying these constraints among $$\binom{n}{3}$$ such sets. We have $$p\_i=\cfrac{6(i-1)(n-i)}{n(n-1)(n-2)}$$.
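
For small $$n$$ the formula can be confirmed by brute force over all $$\binom{n}{3}$$ pivot samples; with distinct keys $$1,\dots,n$$ we simply have $$z\_i=i$$ (the function names below are ours):

```python
from itertools import combinations
from fractions import Fraction

def p_exact(n, i):
    # Probability that the median of a random 3-subset of {1, ..., n} is i,
    # computed by enumerating all C(n, 3) subsets.
    total = good = 0
    for s in combinations(range(1, n + 1), 3):
        total += 1
        if s[1] == i:       # combinations() yields sorted tuples
            good += 1
    return Fraction(good, total)

def p_formula(n, i):
    # The closed form p_i = 6(i-1)(n-i) / (n(n-1)(n-2)).
    return Fraction(6 * (i - 1) * (n - i), n * (n - 1) * (n - 2))

assert all(p_exact(10, i) == p_formula(10, i) for i in range(1, 11))
```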

### b.

We need to find $$\cfrac{p\_{\lfloor(n+1)/2 \rfloor}}{1/n}$$ at the limit, when $$n \to \infin$$. Observe that $$i=\lfloor(n+1)/2 \rfloor$$ tends to $$n/2$$ as $$n$$ increases. Therefore, we have

$$
\lim\_{n\to \infin} \cfrac{\cfrac{6(n/2-1)(n-n/2)}{n(n-1)(n-2)}}{1/n}=\lim\_{n\to \infin}  \cfrac{\cfrac{3}{2}(n^2-2n)}{n^2-3n+2}=\cfrac{3}{2}
$$

### c.

For convenience, suppose $$n$$ is a multiple of 3. We need to find $$\cfrac{\sum\_{i=n/3}^{2n/3} p\_i}{1/3}$$ at the limit, when $$n \to \infin$$. Following the hint from the book, we approximate the sum by an integral.

$$
\begin{align\*}
\sum\_{i=n/3}^{2n/3} p\_i &\approx \int\_{n/3}^{2n/3} \frac{6(-x^2+nx+x-n)}{n(n-1)(n-2)}dx \\
&= \frac{6(-7n^3/81+3n^3/18+\Theta(n^2))}{n(n-1)(n-2)} \\
&\to \frac{13}{27} && \text{(when $n \to \infin$)}
\end{align\*}
$$

Thus, the median-of-3 method increases the likelihood of getting a good split, compared with the ordinary implementation, by a factor of $$\cfrac{13/27}{1/3}=\cfrac{13}{9}$$.

### d.

The $$\Omega(n \lg n)$$ lower bound on quicksort's running time holds even if the algorithm chooses the best possible pivot (the median of the subarray) in every partitioning step. Thus, any technique that merely improves the chances of getting good splits cannot improve the asymptotic lower bound of quicksort.

## Problem 7-7

### a.

The `Fuzzy-Partition` function is similar to `Partition'` from [Problem 7-2](#problem-7-2), part (b). The rest of the code of randomized quicksort uses this partitioning function instead of the ordinary version.

```
Fuzzy-Partition(A, p, r)
    x = A[p]    // Pivot is chosen to be at the beginning of A[p:r]. 
    i = p
    h = p
    for j = p + 1 to r
        if A[j].b < x.a
            y = A[j] 
            A[j] = A[h + 1]
            A[h + 1] = A[i]
            A[i] = y
            i = i + 1
            h = h + 1
        elseif A[j].a <= x.b
                // Mutually overlapping intervals including the pivot are treated as equal.
                // There could be many such joint sets of intervals, we select one of them.
            x.a = Max(A[j].a, x.a)
            x.b = Min(A[j].b, x.b)
            exchange A[h + 1] with A[j] 
            h = h + 1
    return i, h    // A[i:h] elements fuzzy-equal to x.
```
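
A direct transcription of the pseudocode into Python, with intervals represented as `(a, b)` pairs and a small randomized demo at the end (the driver `fuzzy_sort` is ours; the full algorithm would also randomize the pivot choice):

```python
import random

def fuzzy_partition(A, p, r):
    # A[p..r] holds closed intervals (a, b); the pivot starts as A[p] and
    # shrinks to the common intersection of all intervals deemed "equal".
    xa, xb = A[p]
    i = h = p
    for j in range(p + 1, r + 1):
        if A[j][1] < xa:            # entirely left of the pivot interval
            y = A[j]                # three-way rotation, as in the pseudocode
            A[j] = A[h + 1]
            A[h + 1] = A[i]
            A[i] = y
            i += 1
            h += 1
        elif A[j][0] <= xb:         # overlaps the pivot: treat as equal
            xa = max(A[j][0], xa)
            xb = min(A[j][1], xb)
            A[h + 1], A[j] = A[j], A[h + 1]
            h += 1
    return i, h                     # A[i..h] is fuzzy-equal to the pivot

def fuzzy_sort(A, p=0, r=None):
    if r is None:
        r = len(A) - 1
    if p < r:
        i, h = fuzzy_partition(A, p, r)
        fuzzy_sort(A, p, i - 1)     # intervals entirely left of the pivot
        fuzzy_sort(A, h + 1, r)     # intervals entirely right of the pivot

# Demo: fuzzy-sort random intervals. The result is fuzzy-sorted when a
# non-decreasing representative c_i can be picked from each interval,
# which the greedy choice c_i = max(c_{i-1}, a_i) certifies.
rng = random.Random(3)
A = []
for _ in range(150):
    a = rng.uniform(0, 10)
    A.append((a, a + rng.uniform(0, 4)))
fuzzy_sort(A)
```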

### b.

Observe that as the intervals overlap more and more, the "equal" sector grows. These intervals are omitted from subsequent recursive calls. When all intervals overlap, a single top-level call to `Fuzzy-Partition` leaves the whole array fuzzy-sorted, so the expected running time reduces to $$\Theta(n)$$.

Otherwise, our algorithm runs in the usual $$\Theta(n \lg n)$$ expected time.
