Chapter 5
Exercise 5.1-1
We prove the contrapositive, if we don't have total order on the ranks of the candidates, then we cannot always determine which candidate is best, in line 4 of procedure Hire-Assistant.
A lack of total order implies that there are incomparable candidates. Assume one such pair is denoted by indices i<j and currently best=i. When we reach candidate j, we cannot evaluate the condition in line 4. Consequently, we are unable to decide should we keep i or switch to j.
Exercise 5.1-2
We can employ an algorithm similar to binary search. Let n=b−a+1 be a size of the interval [a,b]and assume that one call to Random(0,1) takes a constant amount of time.
Each iteration splits the current range of size m>1 into ⌊m/2⌋ and ⌈m/2⌉ sized chunks, randomly discards one part of it, and continues until the size of the range drops to 1. The number of iterations (expected running time) is Θ(lgn). The probability of x∈[a,b] being selected is ∏i=1lgn(1/2)=1/2lgn=1/n, since calls to Random(0,1) can be regarded as Bernoulli trials with probability of success 1/2. Therefore, Random(a,b) ensures uniform distribution over returned values.
Exercise 5.1-3
Available in the latest revision of the IM.
Exercise 5.2-1
Available in the latest revision of the IM.
Exercise 5.2-2
Available in the latest revision of the IM.
Exercise 5.2-3
Available in the latest revision of the IM.
Exercise 5.2-4
Available in the latest revision of the IM.
Exercise 5.2-5
Available in the latest revision of the IM.
Exercise 5.2-6
Available in the latest revision of the IM.
Exercise 5.3-1
Available in the latest revision of the IM.
Exercise 5.3-2
Available in the latest revision of the IM.
Exercise 5.3-3
Available in the latest revision of the IM.
Exercise 5.3-4
Available in the latest revision of the IM.
Exercise 5.3-5
Available in the latest revision of the IM.
Exercise 5.4-1
In both subproblems we assume that birthdays are independent and uniformly distributed.
Subproblem 1
We follow a similar analysis as for the birthday paradox from the book. Let Ai be the event that person i has a different birthday than I do. We have that Pr{Ai}=1−1/n. Let Bk=⋂i=1kAi be the event that none of the k people has the same birthday as I do. Observe that Pr{Ak∣Bk−1}=Pr{Ak}. We iteratively apply the recurrence (5.8) to obtain
We need to find k such that Pr{Bk}≥1/2, or Pr{Bk}≤1/2. This gives k=⌈log1−1/n(1/2)⌉. For n=365 it is k=253.
Subproblem 2
Let the random variable Xk denote the number of people (out of k) having a birthday on July 4. This variable follows the binomial distribution, so Pr{Xk=i}=b(i;k,1/n). We are looking for the smallest k such that Pr{Xk=0}+Pr{Xk=1}<1/2. For n=365 this gives k=613.
Exercise 5.4-2
We need to find k such that Pr{Bk}≤1−0.99=0.01. Solving for k gives
For n=365 it is k=59. Plugging these values into the formula 2nk(k−1) we get, that the the expected number of pairs of people with the same birthday is approximately 4.69.
Exercise 5.4-3
Observe that the first toss will land in an empty bin. Let X be a random variable denoting how many additional tosses are needed until some bin contains two balls. Obviously Pr{X≥1}=1, as we surely need at least one extra toss. Pr{X≥2}=(b−1)/b, since we must have 1 miss. In general, at least k tosses are required, when none of the first k−1 balls ended up in an occupied bin, so
We know that Pr{X=k}=Pr{X≥k}−Pr{X≥k+1} and Pr{X≥b+1}=0. When a random variable X takes on values from the set of natural numbers, then we have a nice formula for its expectation (see equation (C.28) on page 1193 in the book). Therefore, the expected number of tosses is
The expression above is exactly the same as for finding an average number of people to get at least one shared birthday. You can read more about it here, where you will also find an explanation for the asymptotic growth.
Exercise 5.4-4
It is important that the birthdays be mutually independent.
The conditional probability chain, obtained by iteratively applying the recurrence (5.8), depends on ensuring that, at each step, the birthday variable bi has the same uniform distribution, unaffected by the previous choices apart from not colliding with earlier birthdays. This chain is only valid when the entire joint assignment is described by uniform, mutually independent random variables. Pairwise independence does not ensure that, once several variables have been assigned, the distribution of the next variable is still as required for the calculation: dependencies among larger groups can profoundly alter probabilities for intersections and unions of events. On page 1188 of the book, you have a concrete example about the difference between pairwise and mutual independence.
Exercise 5.4-5
This sequence lists solutions for variations of the birthday problem depending on how many people should share the same birthday. For this exercise it is k=88. Let us see how we can derive it analytically.
We assume that birthdays are mutually independent and uniformly distributed, so any sequence of k birthdays occurs with probability 1/nk. For k>1 and i=0,1,…,⌊k/2⌋, let Qi be the probability that among k people exactly i pairs of people have matching birthdays, which implies that no triples of people share birthdays. Our probability Pk that among k people there are triples sharing birthdays is given by
We can select i days out of n, and assign these to i unallocated pairs of people, in (in) ways. The remaining n−i days can be assigned to the remaining k−2i people in (k−2in−i) ways. Since people are distinguishable, the total number of allocations should be multiplied by k!. We also need to divide the resulting number by 2i, since people sharing the same birthday are relatively unordered. After combining these pieces, we get
You can verify that k=88 entails Qi≥1/2.
Exercise 5.4-6
We can form nk k-strings over a set of size n. Any k-string with distinct elements is a k-permutation. By equation (C.1), there are n(n−1)⋯(n−k+1) possible k-permutations. If we denote by Bk the event that k people have distinct birthdays, then the probability that a k-string over a set of size n forms a k-permutation equals Pr{Bk}; on page 141 of the book you have this in expanded form.
Exercise 5.4-7
Available in the latest revision of the IM.
Exercise 5.4-8
Let's partition the n coin flips into at least ⌊n/s⌋ groups of s=⌊lgn−2lglgn⌋ consecutive flips and bound the probability that no group comes up all heads. By equation (5.9), the probability that the group starting in position i comes up all heads is
The probability that a streak of heads of length at least s does not begin in position i is therefore at most 1−(lg2n)/n. Since the ⌊n/s⌋ groups are formed from mutually exclusive, independent coin flips, the probability that everyone of these groups fails to be a streak of length s is at most
Using the same reasoning as for deriving equation (5.13) (see page 148 in the book), the probability that the longest streak equals or exceeds s is 1−1/n.
Problem 5-1
Available in the latest revision of the IM.
Problem 5-2
Available in the latest revision of the IM.
A more rigorous treatment of tasks (e) and (f) is based on conditional expectation. Let Xi=I{A[i] is examined} and X=∑i=1nXi equal the number of indices of A that are examined. Define S={i:A[i]=x} and S={i:A[i]=x}, so that ∣S∣=k and ∣S∣=n−k. A position i∈S is examined only if it’s the first position in S, which occurs with probability 1/k. A position i∈/S is examined only if, out of position i and all k positions in S, i is the first position, which occurs with probability 1/(k+1). Let CS be an event that the indices of elements we are searching for belong in S. We have
Last updated