Chapter 5

Exercise 5.1-1

We prove the contrapositive: if we do not have a total order on the ranks of the candidates, then we cannot always determine which candidate is best in line 4 of procedure Hire-Assistant.

A lack of total order implies that there are incomparable candidates. Assume one such pair is denoted by indices i < j and that currently best = i. When we reach candidate j, we cannot evaluate the condition in line 4. Consequently, we are unable to decide whether to keep i or switch to j.

Exercise 5.1-2

We can employ an algorithm similar to binary search. Let n = b - a + 1 be the size of the interval [a,b], and assume that one call to Random(0,1) takes a constant amount of time.

Each iteration splits the current range of size m > 1 into chunks of sizes \lfloor m/2 \rfloor and \lceil m/2 \rceil, randomly discards one of them, and continues until the size of the range drops to 1. The number of iterations (and hence the expected running time) is \Theta(\lg n). When n is a power of 2, every value x \in [a,b] is reached by exactly \lg n coin flips, so its probability of being selected is \prod_{i=1}^{\lg n}(1/2)=1/2^{\lg n}=1/n, since calls to Random(0,1) can be regarded as Bernoulli trials with probability of success 1/2. When n is not a power of 2, the two halves have unequal sizes and this argument no longer gives a uniform distribution; a standard fix is to sample uniformly from a range of size 2^{\lceil \lg n \rceil} and reject values outside [a,b]. The accepted values remain uniform over [a,b], and the expected running time is still \Theta(\lg n).
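As a sanity check, here is a minimal Python sketch of the rejection-sampling variant, where random01 is a hypothetical stand-in for the book's Random(0,1):

```python
import random

def random01():
    # Stand-in for Random(0, 1): one fair random bit.
    return random.randrange(2)

def random_ab(a, b):
    # Rejection sampling: assemble ceil(lg n) random bits into a number r
    # and retry whenever r falls outside [0, n); every accepted value of r
    # is equally likely, so a + r is uniform over [a, b].
    n = b - a + 1
    bits = max(1, (n - 1).bit_length())
    while True:
        r = 0
        for _ in range(bits):
            r = 2 * r + random01()  # append one random bit
        if r < n:
            return a + r
```

Each attempt uses \lceil \lg n \rceil flips and succeeds with probability greater than 1/2, so the expected number of flips stays \Theta(\lg n).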

Exercise 5.1-3

Available in the latest revision of the IM.

Exercise 5.2-1

Available in the latest revision of the IM.

Exercise 5.2-2

Available in the latest revision of the IM.

Exercise 5.2-3

Available in the latest revision of the IM.

Exercise 5.2-4

Available in the latest revision of the IM.

Exercise 5.2-5

Available in the latest revision of the IM.

Exercise 5.2-6

Available in the latest revision of the IM.

Exercise 5.3-1

Available in the latest revision of the IM.

Exercise 5.3-2

Available in the latest revision of the IM.

Exercise 5.3-3

Available in the latest revision of the IM.

Exercise 5.3-4

Available in the latest revision of the IM.

Exercise 5.3-5

Available in the latest revision of the IM.

Exercise 5.4-1

In both subproblems we assume that birthdays are independent and uniformly distributed.

Subproblem 1

We follow a similar analysis as for the birthday paradox from the book. Let A_i be the event that person i has a different birthday than I do. We have \Pr\{A_i\}=1-1/n. Let B_k = \bigcap_{i=1}^k A_i be the event that none of the k people has the same birthday as I do. Observe that \Pr\{A_k \mid B_{k-1}\}=\Pr\{A_k\}. We iteratively apply the recurrence (5.8) to obtain

\begin{align*} \Pr\{B_k\} &= \prod_{i=1}^k\Pr\{A_i\} \\ &= \prod_{i=1}^k\left(1-\frac{1}{n}\right) \\ &= \left(1-\frac{1}{n}\right)^k \end{align*}

We need to find k such that \Pr\{\overline B_k\} \ge 1/2, or equivalently \Pr\{B_k\} \le 1/2. This gives k = \lceil\log_{1-1/n}(1/2)\rceil. For n=365 it is k=253.
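A quick numerical check of this threshold (a Python sketch, assuming n = 365):

```python
import math

n = 365
# smallest k with (1 - 1/n)^k <= 1/2, i.e. k = ceil(log_{1-1/n}(1/2))
k = math.ceil(math.log(0.5) / math.log(1 - 1 / n))
print(k)  # → 253
```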

Subproblem 2

Let the random variable X_k denote the number of people (out of k) having a birthday on July 4. This variable follows the binomial distribution, so \Pr\{X_k=i\}=b(i;k,1/n). We are looking for the smallest k such that \Pr\{X_k=0\}+\Pr\{X_k=1\} < 1/2. For n=365 this gives k=613.
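The threshold can be found by a direct search over the binomial tail (Python sketch, n = 365):

```python
n = 365
p = 1 / n

def pr_at_most_one(k):
    # Pr{X_k = 0} + Pr{X_k = 1} for X_k ~ Binomial(k, p)
    return (1 - p) ** k + k * p * (1 - p) ** (k - 1)

k = 1
while pr_at_most_one(k) >= 0.5:
    k += 1
print(k)  # → 613
```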

Exercise 5.4-2

We need to find k such that \Pr\{B_k\} \le 1-0.99=0.01. Solving for k gives

k \ge \frac{1+\sqrt{1+8n\ln100}}{2}

For n=365 it is k=59. Plugging these values into the formula \frac{k(k-1)}{2n}, we find that the expected number of pairs of people with the same birthday is approximately 4.69.
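Checking both numbers (Python sketch, n = 365):

```python
import math

n = 365
k = math.ceil((1 + math.sqrt(1 + 8 * n * math.log(100))) / 2)
pairs = k * (k - 1) / (2 * n)  # expected number of matching pairs
print(k, round(pairs, 2))  # → 59 4.69
```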

Exercise 5.4-3

Observe that the first toss will land in an empty bin. Let X be a random variable denoting how many additional tosses are needed until some bin contains two balls. Obviously \Pr\{X \ge 1\}=1, as we surely need at least one extra toss, and \Pr\{X \ge 2\}=(b-1)/b, since we must have one miss. In general, at least k additional tosses are required when none of the first k-1 of them lands in an occupied bin, so

\begin{align*} \Pr\{X\ge k\} &= \frac{b-1}{b}\cdot\frac{b-2}{b}\cdots\frac{b-(k-1)}{b} \\[1mm] &= \frac{b}{b}\cdot\frac{b-1}{b}\cdot\frac{b-2}{b}\cdots\frac{b-(k-1)}{b} \\[1mm] &= \frac{b!}{(b-k)!b^k} \end{align*}

We know that \Pr\{X=k\}=\Pr\{X \ge k\}-\Pr\{X \ge k+1\} and \Pr\{X \ge b+1\}=0. When a random variable X takes on values from the set of natural numbers, we have a nice formula for its expectation (see equation (C.28) on page 1193 in the book). Therefore, the expected number of tosses is

1+\sum_{k=1}^b \Pr\{X \ge k\} = 1+\sum_{k=1}^b\frac{b!}{(b-k)!b^k}=\Theta(\sqrt b)

The expression above is exactly the same as the one for the average number of people needed to get at least one shared birthday; the sum is known as Ramanujan's Q-function, whose \Theta(\sqrt b) growth explains the asymptotic bound.
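The exact expectation is easy to evaluate numerically; the sketch below keeps \Pr\{X \ge k\} as a running product. For b = 365 it gives about 24.62, consistent with the \Theta(\sqrt b) growth (\sqrt{\pi b/2} \approx 23.9).

```python
def expected_tosses(b):
    # 1 + sum_{k=1}^{b} Pr{X >= k}, with Pr{X >= k} maintained as the
    # running product ((b-1)/b) * ((b-2)/b) * ... * ((b-k+1)/b).
    total, prob = 0.0, 1.0  # prob = Pr{X >= k}, starting at k = 1
    for k in range(1, b + 1):
        total += prob
        prob *= (b - k) / b
    return 1 + total
```

Quadrupling b roughly doubles the expectation, as the square-root growth predicts.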

Exercise 5.4-4

It is important that the birthdays be mutually independent.

The conditional probability chain, obtained by iteratively applying the recurrence (5.8), depends on ensuring that, at each step, the birthday variable b_i has the same uniform distribution, unaffected by the previous choices apart from not colliding with earlier birthdays. This chain is only valid when the entire joint assignment is described by uniform, mutually independent random variables. Pairwise independence does not ensure that, once several variables have been assigned, the distribution of the next variable is still as required for the calculation: dependencies among larger groups can profoundly alter probabilities for intersections and unions of events. On page 1188 of the book, you will find a concrete example of the difference between pairwise and mutual independence.
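The standard counterexample can be checked mechanically: take two fair bits and their XOR. The sketch below verifies that the three variables are pairwise independent yet not mutually independent.

```python
from itertools import product

# Sample space: X, Y independent fair bits, Z = X XOR Y; the four
# outcomes (x, y) are equally likely.
space = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]

def pr(pred):
    # Probability of an event over the uniform sample space.
    return sum(1 for w in space if pred(*w)) / len(space)

marg = [pr(lambda x, y, z, j=j: (x, y, z)[j] == 1) for j in range(3)]
pair = pr(lambda x, y, z: x == 1 and z == 1)
triple = pr(lambda x, y, z: x == 1 and y == 1 and z == 1)
print(pair == marg[0] * marg[2])              # → True  (pairwise holds)
print(triple == marg[0] * marg[1] * marg[2])  # → False (mutual fails)
```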

Exercise 5.4-5

This is a variation of the birthday problem in which we ask how many people are needed before three of them share a birthday with probability at least 1/2. For this exercise the answer is k=88. Let us see how we can derive it analytically.

We assume that birthdays are mutually independent and uniformly distributed, so any sequence of k birthdays occurs with probability 1/n^k. For k>1 and i=0, 1, \dots, \lfloor k/2\rfloor, let Q_i be the probability that among k people exactly i pairs of people have matching birthdays, which implies that no three people share a birthday. Our probability P_k that among k people some three share a birthday is given by

P_k=1-\sum_{i=0}^{\lfloor k/2\rfloor}Q_i

We can select i days out of n, and assign these to i unallocated pairs of people, in \binom{n}{i} ways. The remaining k-2i people can then be given distinct birthdays from the remaining n-i days in \binom{n-i}{k-2i} ways. Since people are distinguishable, the total number of allocations should be multiplied by k!. We also need to divide the result by 2^i, since the two people within a shared-birthday pair are unordered. After combining these pieces, we get

Q_i = \frac{1}{n^k} \cdot \frac{k!}{2^i}\binom{n}{i}\binom{n-i}{k-2i}

You can verify that k=88 is the smallest value for which P_k \ge 1/2.
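The derivation can be verified with exact rational arithmetic (a Python sketch of the formula for Q_i above):

```python
from fractions import Fraction
from math import comb, factorial

def p_triple(n, k):
    # P_k = 1 - sum_i Q_i, with Q_i = (1/n^k) * (k!/2^i) * C(n,i) * C(n-i, k-2i)
    total = Fraction(0)
    for i in range(k // 2 + 1):
        if k - 2 * i > n - i:
            continue  # not enough distinct days left
        total += Fraction(comb(n, i) * comb(n - i, k - 2 * i) * factorial(k),
                          2**i * n**k)
    return 1 - total

k = 2
while p_triple(365, k) < Fraction(1, 2):
    k += 1
print(k)  # → 88
```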

Exercise 5.4-6

We can form n^k k-strings over a set of size n. Any k-string with distinct elements is a k-permutation. By equation (C.1), there are n(n-1)\cdots(n-k+1) possible k-permutations. If we denote by B_k the event that k people have distinct birthdays, then the probability that a k-string over a set of size n forms a k-permutation equals \Pr\{B_k\}; on page 141 of the book you will find this in expanded form.
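Numerically, \Pr\{B_k\} = n(n-1)\cdots(n-k+1)/n^k is exactly the fraction of k-strings that are k-permutations; a quick check reproduces the classic birthday threshold (Python sketch):

```python
def pr_distinct(n, k):
    # n(n-1)...(n-k+1) / n^k, as a running product of (1 - i/n) factors
    prob = 1.0
    for i in range(k):
        prob *= (n - i) / n
    return prob

# With n = 365, the probability of all-distinct birthdays first drops
# below 1/2 at k = 23.
print(pr_distinct(365, 22) > 0.5 > pr_distinct(365, 23))  # → True
```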

Exercise 5.4-7

Available in the latest revision of the IM.

Exercise 5.4-8

Let's partition the n coin flips into \lfloor n/s \rfloor disjoint groups of s=\lfloor \lg n-2\lg\lg n\rfloor consecutive flips each (discarding any leftover flips) and bound the probability that no group comes up all heads. By equation (5.9), the probability that the group starting in position i comes up all heads is

\begin{align*} \Pr\{A_{i,\,s}\} &= \cfrac{1}{2^{s}} \\[1mm] &\ge \cfrac{1}{2^{\lg n-2\lg\lg n}} \\[3mm] &= \cfrac{2^{2\lg\lg n}}{2^{\lg n}} \\[3mm] &= \cfrac{\lg^2n}{n} \end{align*}

The probability that a streak of heads of length at least s does not begin in position i is therefore at most 1-(\lg^2n)/n. Since the \lfloor n/s \rfloor groups are formed from disjoint, independent sets of coin flips, the probability that every one of these groups fails to be a streak of length s is at most

\begin{align*} \left(1-\frac{\lg^2n}{n}\right)^{\lfloor n/s \rfloor} &\le \left(1-\frac{\lg^2n}{n}\right)^{n/s-1} \\ &\le \left(1-\frac{\lg^2n}{n}\right)^{n/\lg n} && \text{(since $n/s-1 \ge n/\lg n$ for sufficiently large $n$)} \\ &\le \left(e^{-(\lg^2n)/n}\right)^{n/\lg n} && \text{(by inequality (3.14))} \\ &= e^{-\lg n} \\ &< 2^{-\lg n} && \text{(since $e > 2$ and $\lg n > 0$)} \\ &= 1/n \end{align*}

Using the same reasoning as for deriving equation (5.13) (see page 148 in the book), the probability that the longest streak equals or exceeds s is at least 1-1/n.
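A small simulation illustrates the bound: with n = 1024 we have s = \lfloor \lg n - 2\lg\lg n \rfloor = 3, and a streak of at least s heads appears in essentially every trial (Python sketch).

```python
import random

def longest_head_streak(flips):
    # Length of the longest run of 1s in a 0/1 sequence.
    best = cur = 0
    for f in flips:
        cur = cur + 1 if f == 1 else 0
        best = max(best, cur)
    return best

random.seed(0)
n = 1024
s = 3  # floor(lg 1024 - 2 * lg lg 1024) = floor(10 - 2 * lg 10) = 3
trials = 200
hits = sum(longest_head_streak([random.randrange(2) for _ in range(n)]) >= s
           for _ in range(trials))
```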

Problem 5-1

Available in the latest revision of the IM.

Problem 5-2

Available in the latest revision of the IM.

A more rigorous treatment of tasks (e) and (f) is based on conditional expectation. Let X_i=I\{A[i] \text{ is examined}\} and let X=\sum_{i=1}^n X_i equal the number of indices of A that are examined. Define S=\{i:A[i]=x\} and \overline S=\{i:A[i] \neq x\}, so that |S|=k and |\overline S|=n-k. A position i \in S is examined only if it is the first position of S in the scan order, which occurs with probability 1/k. A position i \notin S is examined only if, among position i and all k positions in S, position i comes first, which occurs with probability 1/(k+1). Let C_S be the event that the set of indices holding the value x is exactly S. We have

\begin{align*} \mathbb{E}[X_i] &= \mathbb{E}[\mathbb{E}[X_i \mid C_S]] \\ &= \sum_{\text{all }S} (\mathbb{E}[X_i \mid C_S] \Pr\{C_S\}) \\ &= \cfrac{1}{\displaystyle{\binom{n}{k}}} \sum_{\text{all }S} \mathbb{E}[X_i \mid C_S] && \text{($\Pr\{C_S\}=1/\binom{n}{k}$)} \\ &= \cfrac{1}{\displaystyle\binom{n}{k}} \cdot \binom{n}{k} \cdot \Pr\{X_i=1 \mid C_S\} && \text{($\mathbb{E}[X_i \mid C_S]=\Pr\{X_i=1 \mid C_S\}$)} \\ &= \Pr\{X_i=1 \mid C_S\} \\ &=\begin{cases} 1/k &\text{if } i \in S, \\ 1/(k+1) &\text{otherwise}. \end{cases} \end{align*}
\begin{align*} \mathbb{E}[X] &= \mathbb{E}[\sum_{i=1}^n X_i] \\ &= \sum_{i=1}^n \mathbb{E}[X_i] && \text{(by linearity of expectation)} \\ &= \sum_{i \in S} \mathbb{E}[X_i] + \sum_{i \notin S} \mathbb{E}[X_i] \\ &= \sum_{i \in S} \frac{1}{k} + \sum_{i \notin S} \frac{1}{k+1}\\ &= k \cdot \frac{1}{k} + (n-k) \cdot \frac{1}{k+1} \\ &= \frac{n+1}{k+1} \end{align*}
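A Monte Carlo check of this answer, simulating a scan of a randomly permuted array (a Python sketch; the names and parameters are illustrative):

```python
import random

def examined_count(n, k, rng):
    # Scan a random permutation of n positions, of which positions
    # 0..k-1 hold the value x; count indices examined until x is found.
    order = list(range(n))
    rng.shuffle(order)
    for examined, pos in enumerate(order, start=1):
        if pos < k:
            return examined
    return n

rng = random.Random(42)
n, k, trials = 20, 4, 100_000
avg = sum(examined_count(n, k, rng) for _ in range(trials)) / trials
# avg should be close to (n + 1) / (k + 1) = 4.2
```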
