We apply the rule of product: there are $2^m$ choices of output per input. In general, the total number of $n$-input, $m$-output boolean functions is

$$\underbrace{2^m \cdot 2^m \cdots 2^m}_{2^n \text{ times}} = (2^m)^{2^n}$$

There are $2^{2^n}$ $n$-input, 1-output boolean functions.
Exercise C.1-3
If all seatings were unique, we would have $n!$ arrangements. Rotation can be regarded as an equivalence relation on the set of seatings, splitting them into equivalence classes (members of a class are some number of rotations apart from each other). Each class contains $n$ seatings, so we have $n!/n = (n-1)!$ classes, which is the number of ways $n$ professors can sit around a circular conference table.
Exercise C.1-4
There are 49 even and 50 odd numbers in the set $\{1,2,\ldots,99\}$. A sum of three distinct numbers is even when all of the numbers are even or when one of them is even and the other two are odd. We can use equation (C.2) together with the rules of sum and product to calculate the answer.

$$\binom{49}{3} + \binom{49}{1}\binom{50}{2} = 78449$$
Just for fun, we can also execute the Python 3 script below to verify the result.

```python
from itertools import combinations

print(sum(1 for a, b, c in combinations(range(1, 100), 3)
          if (a + b + c) % 2 == 0))
```
To choose $k$ elements from $n$ elements, we can either decide to use a distinguished element and select the remaining $k-1$ items from the other $n-1$ items, or skip it and select all $k$ items from the remaining $n-1$ elements. This establishes the desired relationship $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$.
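As a quick sanity check, the short Python snippet below verifies Pascal's rule numerically for small values of $n$ and $k$.

```python
from math import comb

# Check Pascal's rule C(n, k) = C(n-1, k-1) + C(n-1, k) for a range of n and k.
for n in range(1, 20):
    for k in range(1, n):
        assert comb(n, k) == comb(n - 1, k - 1) + comb(n - 1, k)

print("Pascal's rule verified for n < 20")
```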
We use equations (A.1) and (C.2) to prove the identity.

$$\sum_{i=1}^n i = \frac{n(n+1)}{2} = \frac{(n+1)!}{2!\,(n-1)!} = \binom{n+1}{2}$$
Exercise C.1-10
Assume $n > 0$; otherwise the maximum is at $k = 0$. We fix $n$ and treat $\binom{n}{k}$ as a function of $k$. First, let us calculate the ratio between successive terms:

$$\frac{\binom{n}{k}}{\binom{n}{k-1}} = \frac{\frac{n}{k}\binom{n-1}{k-1}}{\frac{n}{n-k+1}\binom{n-1}{k-1}} = \frac{n-k+1}{k} \qquad \text{(by the identities from Exercises C.1-5 and C.1-6)}$$

As long as $n-k+1 \ge k$, or equivalently $k \le (n+1)/2$, the function $\binom{n}{k}$ monotonically increases.
If $n$ is odd, then we have an even number of terms in the expansion for $k = 0,1,\ldots,n$. Since $\binom{n}{k}$ is symmetric, there are two maximizers (one on each side): $k = (n+1)/2 = \lceil n/2 \rceil$ and $k = \lfloor n/2 \rfloor$.

If $n$ is even, then there is only one maximum, at $k = \lfloor (n+1)/2 \rfloor = \lfloor n/2 \rfloor = \lceil n/2 \rceil = n/2$.
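The claim about where the maximum lands is easy to confirm numerically:

```python
from math import comb

# For each n, the largest binomial coefficient C(n, k) should occur at
# k = floor(n/2), and (for odd n) also at k = ceil(n/2).
for n in range(1, 25):
    row = [comb(n, k) for k in range(n + 1)]
    peaks = [k for k, v in enumerate(row) if v == max(row)]
    if n % 2 == 0:
        assert peaks == [n // 2]
    else:
        assert peaks == [n // 2, (n + 1) // 2]
```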
The inequality compares $\binom{n}{j+k}$ with the count produced by the rule of product: each selection of $j$ items out of $n$ is multiplied by the number of ways of choosing an additional $k$ items out of the remaining $n-j$. However, this approach overcounts, since it counts the same combination multiple times.

For example, $\binom{5}{2+1} = 10 < \binom{5}{2}\binom{3}{1} = 30$. Suppose we have chosen $\{a,b\}$ from $\{a,b,c,d,e\}$. We have 3 options for the additional item; if we pick $c$, we get $\{a,b,c\}$. But if we start with another initial selection, like $\{a,c\}$, then picking $b$ produces the same set $\{a,b,c\}$ again, so that combination is counted more than once.
Exercise C.1-12
We prove the statement by induction on $k$. The base case $k=0$ is true, provided that for convenience we assume $0^0 = 1$, as mentioned in the book.
For $k > 0$ we have

$$\begin{aligned}
\binom{n}{k} &= \frac{n}{k}\binom{n-1}{k-1} && \text{(by Exercise C.1-5)} \\
&= \frac{n}{k}\cdot\frac{n-k+1}{n}\binom{n}{k-1} && \text{(by Exercise C.1-6)} \\
&\le \frac{n^n}{k(k-1)^{k-1}(n-k+1)^{n-k}} && \text{(by the inductive hypothesis)}
\end{aligned}$$
The sequence $a_n = (1+1/n)^n$ is strictly increasing (see the proof here), and we need to find the threshold below which the ratio of the obtained bound to the desired bound $n^n/(k^k(n-k)^{n-k})$ is at most one. This happens when $k-1 \le n-k$, or equivalently for all integers $k$ such that $k \le (n+1)/2$. According to the principle of mathematical induction, we can conclude that inequality (C.7) holds for all integers $k$ such that $0 \le k \le n/2 < (n+1)/2$.
When $n/2 \le k \le n$ we have $0 \le n-k \le n/2$, so we can apply equation (C.3) after substituting $n-k$ for $k$ in inequality (C.7). Therefore, inequality (C.7) holds for all integers $k$ such that $0 \le k \le n$.
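A numeric spot check of inequality (C.7), done in exact integer arithmetic to avoid floating-point issues (the comparison is cross-multiplied, and Python's `0**0` is already `1`):

```python
from math import comb

# Check C(n, k) <= n^n / (k^k * (n-k)^(n-k)) for all 0 <= k <= n,
# rewritten as C(n, k) * k^k * (n-k)^(n-k) <= n^n.
for n in range(1, 30):
    for k in range(n + 1):
        assert comb(n, k) * k**k * (n - k)**(n - k) <= n**n
```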
Exercise C.1-13
This exercise is flawed, since $\binom{2n}{n} \ne \frac{2^{2n}}{\sqrt{\pi n}}\left(1 + O(1/n)\right)$ under the book's conventions. The reasons will become clear once we prove that $\binom{2n}{n} = \frac{2^{2n}}{\sqrt{\pi n}}\left(1 + O'(1/n)\right)$ (see Problem 3-6 part (d)).
Let $x = \alpha_{2n} - 2\alpha_n$. As $n$ increases, $x \to 0$, so $e^x \approx 1 + x$. We can derive equation (3.25) from equation (3.29) by leveraging the same approximation. Equation (3.29) is exact, and as $n$ increases the adjustment factor satisfies $e^x \to 1^-$ instead of $e^x \to 1^+$ (you can experiment with small values of $n$ to gain intuition). $x$ is always negative and $|x| = O(1/n)$, hence $1 + x = 1 + O'(1/n)$, which concludes the proof.
$\Theta(1/n)$ stands for a whole family of functions bounded by $g(n) = 1/n$, so we cannot cancel the numerator and denominator in the usual manner. On the other hand, we can upper bound the fraction by bounding the numerator from above and the denominator from below. Let $c$ and $d$ represent the constants hidden in the $O(1/n)$ and $\Omega(1/n)$ bounds of the numerator and denominator, respectively:

$$\frac{1 + \Theta(1/n)}{\left(1 + \Theta(1/n)\right)^2} \le \frac{1 + c/n}{(1 + d/n)^2}$$

So far everything seems correct, thus we may be tempted to jump to the following conclusion:

$$\frac{1 + \Theta(1/n)}{\left(1 + \Theta(1/n)\right)^2} = 1 + O(1/n) \qquad \textbf{WRONG!}$$
Below is the citation from the book (see page 58) explaining how to treat asymptotic notations appearing on both sides of an equation:
No matter how the anonymous functions are chosen on the left of the equal sign, there is a way to choose the anonymous functions on the right of the equal sign to make the equation valid.
Pick $f(n) = 1/n$ as a replacement for $\Theta(1/n)$ in both the numerator and the denominator. Consequently, the left side becomes $\frac{1}{1 + 1/n} < 1$. There is no way to select an asymptotically nonnegative function on the right side to make it less than 1.
The book talks about "proper" abuses of asymptotic notations, but as this example illustrates, it is easy to become a victim of such convenience.
Exercise C.1-14
We need to calculate the first and second derivatives of $H(\lambda)$, using the base conversion rule for logarithms $\lg x = \ln x / \ln 2$:

$$H'(\lambda) = \lg\frac{1-\lambda}{\lambda} = \frac{1}{\ln 2}\ln\frac{1-\lambda}{\lambda}, \qquad H''(\lambda) = -\frac{1}{\lambda(1-\lambda)\ln 2}$$

We have $H'(1/2) = 0$ and $H''(1/2) < 0$, so $H(\lambda)$ achieves its maximum value at $\lambda = 1/2$, where $H(1/2) = 1$.
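A quick numeric confirmation that $\lambda = 1/2$ is indeed where the entropy function peaks, by evaluating $H$ on a fine grid:

```python
from math import log2

def H(lam):
    """Binary entropy: H(lambda) = -lambda*lg(lambda) - (1-lambda)*lg(1-lambda)."""
    return -lam * log2(lam) - (1 - lam) * log2(1 - lam)

assert H(0.5) == 1.0
# H(1/2) dominates H on a grid of interior points.
assert all(H(i / 1000) <= H(0.5) for i in range(1, 1000))
```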
Exercise C.1-15
The equation trivially holds when $n = 0$. We solve this problem for $n > 0$ by thinking a bit outside the box, which also demonstrates a useful technique for solving seemingly complicated sums. First, a couple of preliminaries:

$$\sum_{k=0}^n \binom{n}{k} k = \sum_{k=1}^n \binom{n}{k} k, \qquad \sum_{k=1}^n \binom{n}{k} = 2^n - 1$$
Draw an $n \times n$ table listing the terms of the sum diagonally, taking into account the multiplicative factor $k$: cell $(i,j)$ holds $\binom{n}{m}$ with $m = ((i+j-2) \bmod n) + 1$, so that $\binom{n}{k}$ appears exactly $k$ times in the upper-left triangle.

$$\begin{pmatrix}
\binom{n}{1} & \binom{n}{2} & \binom{n}{3} & \cdots & \binom{n}{n} \\
\binom{n}{2} & \binom{n}{3} & \cdots & \binom{n}{n} & \binom{n}{1} \\
\binom{n}{3} & \cdots & \binom{n}{n} & \binom{n}{1} & \binom{n}{2} \\
\vdots & & & & \vdots \\
\binom{n}{n} & \binom{n}{1} & \cdots & \binom{n}{n-2} & \binom{n}{n-1}
\end{pmatrix}$$

Each column sums to $2^n - 1$, so the total over the whole table is $n(2^n - 1)$. Observe that the sums of the elements in the upper-left and bottom-right triangular sub-matrices (both include the central anti-diagonal, which consists of $n$ copies of $\binom{n}{n}$) are the same: the upper-left triangle contains $\binom{n}{k}$ exactly $k$ times, while the bottom-right triangle contains $\binom{n}{k}$ exactly $n-k$ times for $k < n$ plus the $n$ copies of $\binom{n}{n}$, and $\sum_{k=1}^{n-1}(n-k)\binom{n}{k} = \sum_{k=1}^{n-1}k\binom{n}{k}$ by the symmetry $\binom{n}{k} = \binom{n}{n-k}$. Adding the two triangles counts every entry of the table once and the anti-diagonal twice, so we have

$$2\sum_{k=1}^n \binom{n}{k} k = n(2^n - 1) + n\binom{n}{n} = n2^n \implies \sum_{k=0}^n \binom{n}{k} k = n2^{n-1}$$

We have shown that equation (C.12) is true for any integer $n \ge 0$.
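Equation (C.12) is also easy to verify mechanically for small $n$:

```python
from math import comb

# Check sum of k*C(n, k) over k equals n * 2^(n-1).
for n in range(1, 25):
    assert sum(k * comb(n, k) for k in range(n + 1)) == n * 2**(n - 1)
```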
Exercise C.1-16
We can easily verify the cases for $n = 0,1,2,3$ or $k = 0,1$, so assume $n \ge 4$ and $k > 1$. We have

$$\begin{aligned}
\binom{n}{k} &= \frac{n(n-1)\cdots(n-k+1)}{k!} = \frac{n^k}{k!}\left(1-\frac{1}{n}\right)\cdots\left(1-\frac{k-1}{n}\right) \\
&\ge \frac{n^k}{k!}\left(1-\frac{k-1}{n}\right)^{k-1} \\
&> \frac{n^k}{k!}\left(1-\frac{1}{\sqrt{n}}\right)^{\sqrt{n}} && \text{($k-1 < \sqrt{n}$, and $x^p$ strictly decreases in $p$ when $0 < x < 1$)} \\
&\ge \frac{n^k}{4\,k!} && \text{(since $(1-1/\sqrt{n}\,)^{\sqrt{n}} \ge 1/4$ for all $n \ge 4$)}
\end{aligned}$$
Exercise C.2-1
Let event $A$ be that Professor Rosencrantz obtains strictly more heads than Professor Guildenstern. $R_*$ and $G_*$ denote various events about the results for Professors Rosencrantz and Guildenstern, respectively. We have

$$\begin{aligned}
A &= (R_{H\ge1} \cap G_T) \cup (R_{H=2} \cap G_H) \\
\Pr\{A\} &= \Pr\{(R_{H\ge1} \cap G_T) \cup (R_{H=2} \cap G_H)\} \\
&= \Pr\{R_{H\ge1} \cap G_T\} + \Pr\{R_{H=2} \cap G_H\} && \text{(events are mutually exclusive)} \\
&= \Pr\{R_{H\ge1}\}\Pr\{G_T\} + \Pr\{R_{H=2}\}\Pr\{G_H\} && \text{(events are independent)} \\
&= \frac{3}{4}\cdot\frac{1}{2} + \frac{1}{4}\cdot\frac{1}{2} = \frac{1}{2}
\end{aligned}$$
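Just for fun, we can enumerate the whole sample space to confirm the answer. The snippet assumes, per the derivation above, that Professor Rosencrantz flips the coin twice and Professor Guildenstern once.

```python
from itertools import product

# All equally likely outcomes: (r1, r2) for Rosencrantz, g for Guildenstern.
outcomes = list(product('HT', repeat=3))
wins = sum(1 for r1, r2, g in outcomes
           if (r1, r2).count('H') > (g,).count('H'))
print(wins, '/', len(outcomes))  # 4 / 8, i.e. probability 1/2
```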
Exercise C.2-2
Define a new sequence B inductively as follows.
$$B_n = \begin{cases} A_1 & \text{if } n = 1, \\ A_n - \bigcup_{i=1}^{n-1} B_i & \text{if } n > 1. \end{cases}$$
This new sequence has the following desired properties:
- By definition, for each $n$, $B_n \subseteq A_n$.
- All members of the sequence $B$ are pairwise mutually exclusive. For any distinct indices $1 \le i < j \le n$ it is true that $B_j = A_j - (B_1 \cup \cdots \cup B_i \cup \cdots \cup B_{j-1})$, hence $B_i \cap B_j = \emptyset$.
- For each $n$, $\bigcup_{i=1}^n B_i = \bigcup_{i=1}^n A_i$, which we prove by induction on $n$. The base cases are $n = 1$ and $n = 2$; both hold based upon how the union of sets behaves, namely $A \cup B = A \cup (B - A)$, as duplicates are not allowed. For $n > 2$ we have
$$\begin{aligned}
\bigcup_{i=1}^n B_i &= \bigcup_{i=1}^{n-1} B_i \cup B_n && \text{(by associativity)} \\
&= \bigcup_{i=1}^{n-1} A_i \cup B_n && \text{(by the inductive hypothesis)} \\
&= \bigcup_{i=1}^{n-1} A_i \cup \left(A_n - \bigcup_{i=1}^{n-1} A_i\right) && \text{(by the inductive hypothesis)} \\
&= \bigcup_{i=1}^n A_i && \text{(by the base case)}
\end{aligned}$$
The last fact implies that $\bigcup_{i=1}^\infty B_i = \bigcup_{i=1}^\infty A_i$. Using the generalized 3rd axiom of probability (see the book), we can prove Boole's inequality for any countably infinite sequence of events (simply replace $\infty$ with $n$ for the finite case).

$$\begin{aligned}
\Pr\left\{\bigcup_{i=1}^\infty A_i\right\} &= \Pr\left\{\bigcup_{i=1}^\infty B_i\right\} \\
&= \sum_{i=1}^\infty \Pr\{B_i\} && \text{(all events are pairwise disjoint)} \\
&\le \sum_{i=1}^\infty \Pr\{A_i\} && \text{($\Pr\{B_i\} \le \Pr\{A_i\}$ for all $i$)}
\end{aligned}$$
Exercise C.2-3
Let a 3-tuple $(i,j,k)$ represent the numbers on the selected cards in order of their removal from the deck. For a fixed $j$, all triples having $i < j < k$ are in sorted (increasing) order; there are $(j-1)(10-j)$ such triples. Let $A_j$ denote the event that a triple for a given $j$ is in sorted order, and $A$ the event that a triple is in sorted order. We have

$$\begin{aligned}
A &= \bigcup_{j=2}^9 A_j \\
\Pr\{A\} &= \Pr\left\{\bigcup_{j=2}^9 A_j\right\} = \sum_{j=2}^9 \Pr\{A_j\} && \text{(events $A_j$ are mutually exclusive)} \\
&= \sum_{j=2}^9 \frac{(j-1)(10-j)}{\binom{10}{3}3!} = \frac{120}{720} = \frac{1}{6} && \text{(by equations (A.1), (A.4) and (C.2))}
\end{aligned}$$
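We can double-check this by exhaustively enumerating every ordered draw of 3 cards from 10:

```python
from fractions import Fraction
from itertools import permutations

# All ordered triples of distinct cards; count those drawn in increasing order.
triples = list(permutations(range(1, 11), 3))
sorted_count = sum(1 for t in triples if t[0] < t[1] < t[2])
assert Fraction(sorted_count, len(triples)) == Fraction(1, 6)
print(sorted_count, len(triples))  # 120 720
```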
Exercise C.2-4
$$\begin{aligned}
\Pr\{A \mid B\} + \Pr\{\bar A \mid B\} &= \frac{\Pr\{A \cap B\}}{\Pr\{B\}} + \frac{\Pr\{\bar A \cap B\}}{\Pr\{B\}} && \text{(by equation (C.16))} \\
&= \frac{\Pr\{(A \cap B) \cup (\bar A \cap B)\}}{\Pr\{B\}} && \text{(events in the numerator are mutually disjoint)} \\
&= \frac{\Pr\{B\}}{\Pr\{B\}} = 1
\end{aligned}$$
Exercise C.2-5
We prove the statement by induction on n. The base case n=1 trivially holds. For n>1 we have
$$\begin{aligned}
\Pr\left\{\bigcap_{i=1}^n A_i\right\} &= \Pr\left\{\bigcap_{i=1}^{n-1} A_i \cap A_n\right\} && \text{(by associativity)} \\
&= \Pr\left\{\bigcap_{i=1}^{n-1} A_i\right\} \Pr\left\{A_n \,\middle|\, \bigcap_{i=1}^{n-1} A_i\right\} && \text{(by equation (C.16))} \\
&= \Pr\{A_1\}\Pr\{A_2 \mid A_1\} \cdots \Pr\left\{A_n \,\middle|\, \bigcap_{i=1}^{n-1} A_i\right\} && \text{(by the inductive hypothesis)}
\end{aligned}$$
According to the principles of mathematical induction we can conclude that the statement from the book is true for all n≥1.
Exercise C.2-6
Create a 2D sample space $S = \{s_{ij} : 1 \le i,j \le n\}$ with the uniform probability distribution on it, so that $\Pr\{s_{ij}\} = 1/|S| = 1/n^2$. The idea is to specify events $A_i$ in such a way that $|A_i| = n$, $|A_i \cap A_j| = 1$, and $|A_i \cap A_j \cap A_k| = 0$ for any indices $1 \le i < j < k \le n$. Notice that if the probability of any 3-subset of events is zero, then it is also zero for any subset of $k > 3$ of them.

An event $A_m$ should consist only of outcomes that contain the index $m$. This ensures $A_i \cap A_j = \{s_{ij}\}$, so no third event can occur simultaneously. It turns out that $A_i = \{s_{xi} : 1 \le x \le i\} \cup \{s_{ix} : i < x \le n\}$ does the job. We have

$$\Pr\{A_i\} = \frac{n}{n^2} = \frac{1}{n}, \qquad \Pr\{A_i \cap A_j\} = \frac{1}{n^2} = \Pr\{A_i\}\Pr\{A_j\}, \qquad \Pr\{A_i \cap A_j \cap A_k\} = 0 \ne \frac{1}{n^3},$$

so the events are pairwise independent but not mutually independent.
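A sanity check of the construction for a small concrete $n$, representing outcome $s_{xy}$ as the pair `(x, y)`:

```python
from itertools import combinations

n = 6

def A(i):
    """Event A_i = {s_xi : 1 <= x <= i} U {s_ix : i < x <= n}."""
    return {(x, i) for x in range(1, i + 1)} | {(i, x) for x in range(i + 1, n + 1)}

events = {i: A(i) for i in range(1, n + 1)}
assert all(len(e) == n for e in events.values())
assert all(len(events[i] & events[j]) == 1
           for i, j in combinations(range(1, n + 1), 2))
assert all(not (events[i] & events[j] & events[k])
           for i, j, k in combinations(range(1, n + 1), 3))
```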
Suppose we have two light bulbs—one is operational, and the other is unreliable and works in 50% of the cases. We randomly pick one of the light bulbs and switch it on/off twice in succession. Let event A denote a successful first attempt to switch a light bulb on/off, and event B a successful second attempt. Let C be the event that the malfunctioning light bulb was picked. We have
The events $A$ and $B$ are not independent, since $\Pr\{A \cap B\} = \frac{1}{2}\cdot 1 + \frac{1}{2}\cdot\frac{1}{4} = \frac{5}{8}$, while $\Pr\{A\}\Pr\{B\} = \frac{3}{4}\cdot\frac{3}{4} = \frac{9}{16}$. But the events $A$ and $B$ are conditionally independent given $C$:

$$\Pr\{A \cap B \mid C\} = \frac{1}{4} = \frac{1}{2}\cdot\frac{1}{2} = \Pr\{A \mid C\}\Pr\{B \mid C\}$$
Exercise C.2-8
Let $J$, $T$ and $C$ be the events that Jeff, Tim and Carmine will pass the course, respectively. Without any additional information, each has the same probability of $1/3$ (a.k.a. the prior). Let $G_R$ be the event that Professor Gore responds with anything more specific about the exam results. Let $G_C$ be the event that Professor Gore tells Carmine that Jeff is the one who will fail. Observe that this event encodes to whom the information is revealed.
Assumptions
The choice of parameterization highlights a critical aspect of conditional probability problems: the necessity of explicitly defining the decision rules of the agent providing information (here, Professor Gore). The problem specifies neither his behavior when both Jeff and Tim fail, nor whether he is willing to say anything beyond what he has announced to everybody. Such presumptions affect the probability calculation. For example, Carmine's symmetry argument, backed by his claim that Professor Gore "won't be revealing any information," is only an assumption; we cannot enforce such symmetry on the behavior of another agent.
Calculation
We have several conditional probabilities regarding what Professor Gore will announce to Carmine based upon the outcome.

$$\Pr\{G_R \cap G_C \mid J\} = 0, \qquad \Pr\{G_R \cap G_C \mid T\} = p, \qquad \Pr\{G_R \cap G_C \mid C\} = q$$

(1) The first probability is zero because if Jeff is supposed to pass the course, then Professor Gore will not lie to Carmine (a fair assumption).

(2) If Tim is the one who passes, then Jeff is definitely the one Professor Gore should name, but he may decide not to tell this to Carmine; he names Jeff with probability $p$.

(3) If Carmine is the one who passes, Professor Gore picks Jeff with probability $q$.
We want to know Carmine's probability of passing the course now that he has found out that Jeff will fail. We apply equation (C.20) to get

$$\Pr\{C \mid G_R \cap G_C\} = \frac{\Pr\{G_R \cap G_C \mid C\}\Pr\{C\}}{\Pr\{G_R \cap G_C\}} = \frac{q/3}{(0 + p + q)/3} = \frac{q}{p+q}$$

Certainly $p + q > 0$, since we already have evidence that he has pointed at Jeff once.
Results
The two cases below showcase different results due to variations in the initial conditions. In both cases we fix $p = 1$.

If $q = 1/2$, then Carmine's chance of passing is still $q/(p+q) = 1/3$. In the context of similar logic puzzles (e.g., Monty Hall), the intended answer is $1/3$, but strictly speaking it depends on the unknown $q$ (see Problem C-1).

If we set $q = 1$, then Carmine's chance of passing becomes $q/(p+q) = 1/2$. In this biased scenario, Professor Gore is leaking more information about the outcomes.
Exercise C.3-1
Define the random variable $X$ to be the sum of the two values showing on the dice. By linearity of expectation, we have

$$E[X] = E[X_1] + E[X_2] = \frac{7}{2} + \frac{7}{2} = 7,$$

where $X_1$ and $X_2$ denote the values showing on the individual dice.
Define the random variable X to be the index of the maximum/minimum element in the array A[1:n].
In a randomly ordered array containing distinct elements, the maximum/minimum element can be at any index with equal probability; in other words, $\Pr\{X = i\} = 1/n$ for any valid index $i$. We have

$$E[X] = \sum_{i=1}^n i \Pr\{X = i\} = \frac{1}{n}\sum_{i=1}^n i = \frac{n+1}{2}$$
Define the random variable $X$ to be the number of times the player's number appears on the three dice. The probability that it appears on any particular die is $1/6$; hence, with probability $5/6$ it does not. We assume that dice rolls are independent, meaning that we can multiply the individual probabilities together to calculate the joint probability for all 3 dice.

Define the random variable $Y$ to be the player's gain (in dollars) from playing the carnival game once. Apparently, $Y = f(X)$, where the function $f$ maps the number of occurrences of the player's number to monetary value (loss/gain). We have

$$E[Y] = -1\cdot\left(\frac{5}{6}\right)^3 + 1\cdot 3\cdot\frac{1}{6}\left(\frac{5}{6}\right)^2 + 2\cdot 3\cdot\left(\frac{1}{6}\right)^2\frac{5}{6} + 3\cdot\left(\frac{1}{6}\right)^3 = \frac{-125 + 75 + 30 + 3}{216} = -\frac{17}{216}$$

The player will lose about 8 cents on average per game. As a side note, the random variable $X$ follows the binomial distribution with $p = 1/6$.
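The expectation can be computed exactly by brute force, assuming the standard payoff (win the number of matches in dollars, lose one dollar when there is no match); by symmetry the player's number is fixed to 1:

```python
from fractions import Fraction
from itertools import product

# Enumerate all 6^3 equally likely dice outcomes and accumulate the exact gain.
total = Fraction(0)
for roll in product(range(1, 7), repeat=3):
    matches = roll.count(1)
    gain = matches if matches > 0 else -1
    total += Fraction(gain, 6**3)

assert total == Fraction(-17, 216)
print(total, float(total))  # -17/216, about -0.0787 dollars per game
```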
Exercise C.3-4
For any reals $x$ and $y$ it holds that $\min(x,y) + \max(x,y) = x + y$. Therefore, we have

$$\begin{aligned}
\min\{X,Y\} + \max\{X,Y\} &= X + Y \\
E[\min\{X,Y\} + \max\{X,Y\}] &= E[X+Y] \\
E[\min\{X,Y\}] + E[\max\{X,Y\}] &= E[X] + E[Y] && \text{(by linearity of expectation)} \\
E[\max\{X,Y\}] &\le E[X] + E[Y] && \text{($E[\min\{X,Y\}] \ge 0$)}
\end{aligned}$$
Exercise C.3-5
Suppose $f(x_i) = f_X$ for all $i = 1,2,\ldots,p$ and $g(y_j) = g_Y$ for all $j = 1,2,\ldots,q$, where the $x_i$ and $y_j$ are values that the random variables $X$ and $Y$ can take, respectively. We have

$$\begin{aligned}
\Pr\{f(X)=f_X \text{ and } g(Y)=g_Y\}
&= \Pr\left\{\left(\bigcup_{i=1}^p X=x_i\right) \text{ and } \left(\bigcup_{j=1}^q Y=y_j\right)\right\} \\
&= \Pr\left\{\bigcup_{\substack{1\le i\le p \\ 1\le j\le q}} (X=x_i \text{ and } Y=y_j)\right\} \\
&= \sum_{i=1}^p \sum_{j=1}^q \Pr\{X=x_i \text{ and } Y=y_j\} && \text{(mutually disjoint events)} \\
&= \sum_{i=1}^p \sum_{j=1}^q \Pr\{X=x_i\}\Pr\{Y=y_j\} && \text{(independence of $X$ and $Y$)} \\
&= \left(\sum_{i=1}^p \Pr\{X=x_i\}\right)\left(\sum_{j=1}^q \Pr\{Y=y_j\}\right) \\
&= \Pr\{f(X)=f_X\}\Pr\{g(Y)=g_Y\}
\end{aligned}$$
We have not assumed anything about fX and gY, so our conclusion about independence of f(X) and g(Y) generalizes to all cases.
We fix $t$ and define two sets $A = \{s \in S : X(s) \ge t\}$ and $A' = \{s \in S : X'(s) \ge t\}$. For all $s \in S$ it is true that $s \in A' \implies s \in A$ (because $X(s) \ge X'(s)$), thus $A' \subseteq A$. Using the definition of a random variable we get

$$\Pr\{X \ge t\} = \Pr\{A\} \ge \Pr\{A'\} = \Pr\{X' \ge t\}$$
We have not assumed anything about t, so our conclusion generalizes to all cases.
This exercise introduces the notion of stochastic dominance. In our case, we say that X is stochastically larger than X′, denoted as X≽X′.
Exercise C.3-8
The short answer is $E[X^2] \ge E^2[X]$. We can prove this either by using Jensen's inequality (C.30), after noting that $x^2$ is a convex function (its second derivative is the constant $2 > 0$), or by looking at equation (C.32) and recalling that the variance of a random variable is always nonnegative.
Define the random variable $X$ to be the number of trials needed to obtain a success. The probability of a success is $p = b(3;6,1/2) = \binom{6}{3}(1/2)^3(1/2)^3 = 20/64 = 5/16$. According to equation (C.36), $E[X] = 1/p = 3.2$. Therefore, on average we need to flip six fair coins about $3.2$ times before obtaining three heads and three tails.
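A quick exact check of the success probability and the geometric expectation:

```python
from fractions import Fraction
from math import comb

# Success = exactly three heads in six flips of a fair coin.
p = Fraction(comb(6, 3), 2**6)
assert p == Fraction(5, 16)

# Expected number of attempts for a geometric distribution is 1/p.
assert 1 / p == Fraction(16, 5)  # = 3.2
```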
$$b(k;n,p) = \binom{n}{k}p^k q^{n-k} = \binom{n}{n-k}q^{n-k}p^k = b(n-k;n,q) \qquad \text{(by equation (C.3) and $p = 1-q$)}$$
Exercise C.4-5
The binomial distribution $b(k;n,p)$ attains a maximum at an integer $k$ that lies in the range $np - q \le k \le (n+1)p$. As we are looking for an approximation, let $k = np$ and assume it is an integer. The remaining steps are purely mechanical; applying Stirling's approximation,

$$b(np;n,p) = \binom{n}{np}p^{np}q^{nq} \approx \frac{\sqrt{2\pi n}\,(n/e)^n}{\sqrt{2\pi np}\,(np/e)^{np}\,\sqrt{2\pi nq}\,(nq/e)^{nq}}\,p^{np}q^{nq} = \frac{1}{\sqrt{2\pi npq}}$$
We prove the identity using the double counting proof technique. The idea is to use different counting approaches to express the same quantity, in this case the probability of professors getting the same number of heads, in two different ways. By equating those expressions we establish the desired relation.
Aggregating the coin flips
Consider each experiment as a sequence of $2n$ coin flips: the first $n$ by Professor Rosencrantz, the last $n$ by Professor Guildenstern. There are $2^{2n} = 4^n$ possible sequences. Let "head" be a success for Rosencrantz and "tail" a success for Guildenstern, as hinted in the book. If Professor Rosencrantz has $x$ heads (successes), then he has $n-x$ tails. Similarly, if Professor Guildenstern has $n-x$ tails (successes), then he has $n-(n-x) = x$ heads. Therefore, both get the same number of heads exactly when there are $n$ successes in total, which can occur in $\binom{2n}{n}$ ways. Thus, the desired probability is $\binom{2n}{n}/4^n$.
Using random variables
Define R and G to be random variables that represent the number of heads and tails for Professors Rosencrantz and Guildenstern, respectively. These variables are independent and follow the binomial distribution. We have
$$\begin{aligned}
\sum_{k=0}^n \Pr\{R = k \text{ and } G = n-k\} &= \sum_{k=0}^n \Pr\{R = k\}\Pr\{G = n-k\} \\
&= \sum_{k=0}^n b(k;n,1/2)\,b(n-k;n,1/2) \\
&= \sum_{k=0}^n b(k;n,1/2)^2 && \text{(by Exercise C.4-4 when $p = q$)} \\
&= \sum_{k=0}^n \binom{n}{k}^2 \left(\frac{1}{2}\right)^{2k}\left(\frac{1}{2}\right)^{2(n-k)} = \frac{\sum_{k=0}^n \binom{n}{k}^2}{4^n}
\end{aligned}$$

Equating this with the probability $\binom{2n}{n}/4^n$ obtained by aggregating the coin flips yields $\sum_{k=0}^n \binom{n}{k}^2 = \binom{2n}{n}$.
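The identity obtained by equating the two counts, $\sum_{k=0}^n \binom{n}{k}^2 = \binom{2n}{n}$, is easy to confirm numerically:

```python
from math import comb

# Check sum of C(n, k)^2 over k equals C(2n, n) for small n.
for n in range(0, 30):
    assert sum(comb(n, k)**2 for k in range(n + 1)) == comb(2 * n, n)
```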
Exercise C.4-8
As explained in the book, for $k = \lambda n$, where $0 \le \lambda \le 1$, we can rewrite inequality (C.7) as

$$\binom{n}{\lambda n} \le \frac{n^n}{(\lambda n)^{\lambda n}\left((1-\lambda)n\right)^{(1-\lambda)n}} = \left(\frac{1}{\lambda}\right)^{\lambda n}\left(\frac{1}{1-\lambda}\right)^{(1-\lambda)n} = 2^{nH(\lambda)}$$
The solution presented here is based on Exercise C.4-10, so you might want to do that one first. Define $X'$ to be the random variable denoting the total number of successes in $n$ Bernoulli trials, where the $i$th trial has probability $p \ge p_i$ of success. According to Exercise C.4-10, for $0 \le k \le n$ we have $\Pr\{X' \ge k\} \ge \Pr\{X \ge k\}$. Furthermore, $X'$ follows the binomial distribution $b(k;n,p)$, so $\Pr\{X' \ge k\} = \sum_{i=k}^n b(i;n,p)$.
Define the sample space $S$ to consist of outcomes given as sequences of $n$ Bernoulli trials, where the $i$th trial has probability $p_i$ of success. The set $A$ in this exercise corresponds to one such sequence. Define $X_i$ to be the indicator random variable associated with the $i$th trial in $A$ ($X_i = 1$ on a success, otherwise $0$), so the total number of successes in $A$ can be represented by the random variable $X = \sum_{i=1}^n X_i$. We devise an experiment involving the trials in $A$ to obtain the Bernoulli trials in $A'$, following the hint from the book. The point is to show that if each trial in one sequence has a probability of success at least as high as the corresponding trial in the other, then the total number of successes in the first sequence is at least that of the other.
The previously mentioned hint from the book refers to coupling, a powerful proof technique in probability theory. The fundamental idea is to introduce coupled random variables that could be used to compare the original random variables having different probability distributions. In our case, we cannot directly relate sets A and A′, as they contain Bernoulli trials with different distributions. However, we can derive a new set of indicator random variables Xi′, conditionally coupled on Xi, such that they faithfully reflect the probability distributions of random trials in A′. Therefore, the total number of successes in A′ can be represented with the random variable X′=∑i=1nXi′ and we can compare X and X′ using the same sample space.
Define
$$X_i' = \begin{cases} 1 & \text{if } X_i = 1, \\ \mathrm{Bernoulli}(r_i) & \text{if } X_i = 0, \end{cases}$$

where $r_i = \frac{p_i' - p_i}{1 - p_i}$. Notice that $p_i < 1$ when $X_i = 0$, otherwise $X_i$ would always be $1$. For the marginal distribution of $X_i'$ we have

$$\Pr\{X_i' = 1\} = \Pr\{X_i = 1\} + \Pr\{X_i = 0\}\,r_i = p_i + (1 - p_i)\,\frac{p_i' - p_i}{1 - p_i} = p_i'$$
By construction, for all i we have Xi′≥Xi, hence X′≥X. Furthermore, the coupling preserves the independence of the trials within each sequence because the construction for each i depends only on Xi (and independent randomness for the conditional Bernoulli when Xi=0).
According to Exercise C.3-7, since $X'(s) \ge X(s)$ for all $s \in S$, we have $\Pr\{X' \ge k\} \ge \Pr\{X \ge k\}$ for $0 \le k \le n$. This concludes the proof of the statement from the book.
Exercise C.5-1
We expect to have a greater chance of obtaining exactly $n$ heads in $2n$ flips of a fair coin than $n$ heads in $n$ flips, since, on average, we would expect an equal number of heads and tails. We confirm this by calculating the probabilities. Define $X_m$ to be a random variable denoting the number of heads in $m > 0$ flips of a fair coin. Apparently, it has the binomial distribution $b(k;m,1/2)$, thus for $n > 1$ we have

$$\Pr\{X_{2n} = n\} = \binom{2n}{n}\frac{1}{4^n} > \frac{2^n}{4^n} = \frac{1}{2^n} = \Pr\{X_n = n\},$$

since $\binom{2n}{n} = \prod_{i=1}^n \frac{n+i}{i} > 2^n$ for $n > 1$ (each factor is at least 2, strictly so for $i < n$).
As a side note, when n=1 then the two probabilities are equal.
Exercise C.5-2
We prove Corollaries (C.6) and (C.7) following the advice from the book: by inverting the roles of successes and failures in the equations for the left tail of a binomial distribution.

We see that $p = a/(a+1)$ and $q = 1 - p = 1/(a+1)$, so $a = p/q$. Notice also that $0 < k < np$. Define $X$ to be a random variable with the binomial distribution $b(k;n,p)$. We have

For the second part, just replace $\mu = E[X] = np$ in the formula above and simplify $n - np = nq$.
Exercise C.5-6
We follow the hint from the book and show that $p_i e^{\alpha q_i} + q_i e^{-\alpha p_i} \le e^{\alpha^2/2}$ for all nonnegative reals $p_i$, $q_i$ and $\alpha$ such that $p_i + q_i = 1$. The left-hand side of the inequality is the moment generating function of a real-valued random variable $Y_i = X_i - p_i$, where $X_i$ is an indicator random variable taking the value $1$ with probability $p_i$, as described in the book. Apparently, $E[Y_i] = 0$ and $-p_i \le Y_i \le q_i$, thus it is bounded. Therefore, we can apply Hoeffding's lemma (see the attached PDF, which was downloaded in August 2025 from here), which says

$$E\left[e^{\alpha Y_i}\right] \le e^{\alpha^2 (b-a)^2/8}$$

for any zero-mean random variable bounded as $a \le Y_i \le b$. With $a = -p_i$ and $b = q_i$ we get $(b-a)^2 = 1$, hence the bound $e^{\alpha^2/8} \le e^{\alpha^2/2}$.
We have a standard quadratic polynomial in $\alpha$, namely $\alpha^2 n/2 - \alpha r$, that achieves its minimum value at $\alpha = r/n$. We finally have

$$\Pr\{X - \mu \ge r\} \le \exp\left((r/n)^2\,n/2 - (r/n)\,r\right) = e^{-r^2/(2n)}$$
Exercise C.5-7
The function $f(x) = e^{g(x)}$ is monotonically increasing in $g(x)$, so minimizing $g(x)$ also minimizes $f(x)$. Therefore, we need to find the first and second derivatives of the exponent on the right-hand side of inequality (C.51); we have

$$\begin{aligned}
g'(\alpha) &= \mu e^\alpha - r = 0 \implies \alpha_{\min} = \ln(r/\mu) \\
g''(\alpha_{\min}) &= \mu e^{\alpha_{\min}} > 0 && \therefore \alpha_{\min}\text{ is a minimum, assuming } \mu > 0
\end{aligned}$$
Problem C-1
a.
See the solution here. The site contains lots of interesting puzzles that are worth studying.
b.
The first three columns constitute the components of an outcome.

| Initial Pick | Has Offer? | Switch? | Winning? | Probability |
|--------------|------------|---------|----------|-------------|
| Right | Yes | Yes | No  | $p_{\text{right}}\,p_{\text{switch}}/3$ |
| Right | Yes | No  | Yes | $p_{\text{right}}(1-p_{\text{switch}})/3$ |
| Wrong | Yes | Yes | Yes | $2p_{\text{wrong}}\,p_{\text{switch}}/3$ |
| Wrong | Yes | No  | No  | $2p_{\text{wrong}}(1-p_{\text{switch}})/3$ |
| Right | No  | No  | Yes | $(1-p_{\text{right}})/3$ |
| Wrong | No  | No  | No  | $2(1-p_{\text{wrong}})/3$ |
c.
Just sum up the probabilities of the winning outcomes:

$$\Pr\{\text{win}\} = \frac{p_{\text{right}}(1-p_{\text{switch}})}{3} + \frac{2p_{\text{wrong}}\,p_{\text{switch}}}{3} + \frac{1-p_{\text{right}}}{3}$$
d.
He should minimize the expression from part (c). Therefore, his best choices are $p_{\text{right}} = 1$ and $p_{\text{wrong}} = 0$.
e.
When $p_{\text{switch}} = 0$, Monty cannot influence the player's probability of winning a car, so all possible strategies are optimal for him.
f.
We should maximize the expression from part (c). Therefore, we have

$$p_{\text{switch}} = \begin{cases} 1 & \text{if } 2p_{\text{wrong}} \ge p_{\text{right}}, \\ 0 & \text{otherwise.} \end{cases}$$
g.
The combination $p_{\text{right}} = 1$ and $p_{\text{wrong}} = 0$ minimizes the probability of winning as a function of these variables. In this case, $p_{\text{winning}} = (1-p_{\text{switch}})/3$, so setting $p_{\text{switch}} = 0$ maximizes this probability.
h.
Let event $P$ designate that our initial pick is correct, event $O$ that Monty offers us an opportunity to switch, and event $W$ that we win a car. The total probability of getting an offer is

$$\Pr\{O\} = \Pr\{O \mid P\}\Pr\{P\} + \Pr\{O \mid \bar P\}\Pr\{\bar P\} = \frac{p_{\text{right}}}{3} + \frac{2p_{\text{wrong}}}{3} = \frac{p_{\text{right}} + 2p_{\text{wrong}}}{3}$$
We know that Monty gives an offer, so $p_{\text{right}} > 0$ or $p_{\text{wrong}} > 0$, hence $p_{\text{right}} + 2p_{\text{wrong}} > 0$.
Define event S to stand for our decision to switch. The joint event of winning a car and having an opportunity to switch is
$$W \cap O = \left((P \cap \bar S) \cup (\bar P \cap S)\right) \cap O = \left((P \cap O) \cap \bar S\right) \cup \left((\bar P \cap O) \cap S\right)$$

As stated in the problem description, we have no knowledge of Monty's possible motivations or strategies. Consequently, $S$ is independent of the other events. Thus, we have

$$\Pr\{W \mid O\} = \frac{\Pr\{W \cap O\}}{\Pr\{O\}} = \frac{\Pr\{P \cap O\}(1 - p_{\text{switch}}) + \Pr\{\bar P \cap O\}\,p_{\text{switch}}}{\Pr\{O\}} = \frac{p_{\text{right}}(1 - p_{\text{switch}}) + 2p_{\text{wrong}}\,p_{\text{switch}}}{p_{\text{right}} + 2p_{\text{wrong}}}$$

When $p_{\text{switch}} = 1/2$, the value of expression (C.52) is $1/2$. When $p_{\text{switch}} < 1/2$, Monty can choose $p_{\text{right}} = 0$ and $p_{\text{wrong}} = 1$; otherwise, if $p_{\text{switch}} > 1/2$, he can select $p_{\text{right}} = 1$ and $p_{\text{wrong}} = 0$. In both cases he can reduce the probability below $1/2$.
j.
Not knowing anything about a problem's setup (in this case, Monty's motivations and strategies around offering the switch), our best tactic in yes/no decision-making situations is simply to flip a coin. Such a Bernoulli-type random experiment neutralizes the sophisticated methods other participants might employ to reduce our chances of winning.

This problem has highlighted how subtle variations in assumptions can significantly influence the result. Furthermore, we have seen that probability theory is full of surprises compared to our everyday intuition. As a matter of fact, it has demonstrated that we are, in general, inept at handling uncertainty and reasoning about random processes.
Problem C-2
a.
There are $b$ choices for each ball, resulting in $b^n$ ways of putting them into bins.
b.
We follow the hint from the book. There are $(b+n-1)!$ possible arrangements (permutations) of $n$ distinct balls and $b-1$ indistinguishable sticks. The sticks essentially partition the balls into bins, possibly allowing empty bins, too. For example, |21||3 says that bins 1 and 3 are empty, bin 2 contains balls 2 and 1 (in this specific order), and ball 3 is in the last bin. Since the sticks are indistinguishable, we must divide the total number of arrangements by $(b-1)!$ to get the result from the book, $\frac{(b+n-1)!}{(b-1)!}$.
c.
We again follow the hint from the book. Of the arrangements in part (b), $n!$ are repeated if the balls are made identical. Therefore, we have $\frac{(b+n-1)!}{n!\,(b-1)!} = \binom{b+n-1}{n}$.
d.
Out of $b$ bins we select the $n$ that will contain a ball. This is a classical example of counting combinations, so the answer is $\binom{b}{n}$.
e.
We reuse the idea from part (b) with $b-1$ indistinguishable sticks. The only difference is that we now restrict the placement of the sticks to positions strictly between balls, with at most one stick per position. There are $n-1$ allowable positions (between neighboring balls). From these we select $b-1$, thus we get $\binom{n-1}{b-1}$.
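The stars-and-bars count can be cross-checked by brute force for small parameters: the number of ways to put $n$ identical balls into $b$ distinct bins with no bin empty equals the number of compositions of $n$ into $b$ positive parts.

```python
from itertools import product
from math import comb

def count(n, b):
    """Number of compositions of n into b positive parts, by brute force."""
    return sum(1 for c in product(range(1, n + 1), repeat=b) if sum(c) == n)

for n in range(1, 8):
    for b in range(1, n + 1):
        assert count(n, b) == comb(n - 1, b - 1)
```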