In all subproblems, {ak}k≥m denotes a sequence am,am+1,…; for example, {k3}k≥2 starts with 8 (without leading zeros).
Sequence {2k+1}k≥0
We can represent {2k+1}k≥0 as 2{2k}k≥0, which is 2x an elementary OGF from Table 3.1 with c=2. Thus, 2/(1−2z)=k≥0∑2k+1zk. Observe, that we can attain the same result by left shifting the elementary OGF 1/(1−2z) instead of first pulling out a constant factor 2.
Sequence {k2k+1}k≥0
We can represent {k2k+1}k≥0 as 2{k2k}k≥0, which is a scaled version of an elementary OGF from Table 3.1. Thus, 4z/(1−2z)2=k≥0∑k2k+1zk.
Sequence {kHk}k≥1
Let A(z)=∑k≥0k(Hk−1)zk and B(z)=∑k≥0kzk, so that our sequence is their left shifted sum (both summands are elementary OGFs from Table 3.1). We have
The idea behind each subproblem is to find the root OGF from Table 3.1 and then see what operations have been performed on it from Table 3.2.
[zN] for (1−3z)41
The root OGF is a sequence 1,c,c2,… (with c=3) and its convolution with itself is (1−3z)21. This is then further convoluted with itself to produce the final OGF. The result of the first stage is
(1−3z)21=N≥0∑(k≥0∑3k3N−k)zN=N≥0∑(N+1)3NzN.
Repeating the above steps again, we get
[zN](1−3z)41=61(N+1)(N+2)(N+3)3N.
[zN] for (1−z)2ln(1−z)1
The difference operation is applied twice in succession on the elementary OGF. We have to be careful to avoid a division by zero error. For this reason, the first few terms are handled separately. We have
The starting point is the elementary OGF 1−z21=∑N≥0z2N. It is scaled by λ=2 to get 1−2z21. Finally, it is convoluted with itself to produce
[zN](1−2z2)21={(2N+1)2N/20if N is even,if N is odd.
Exercise 3.3
(1−z)1ln(1−z)1(1−z)2ln(1−z)1+1(1−z)2z(ln(1−z)1+1)(1−z)2zln(1−z)1(1−z)2zln(1−z)1=N≥1∑HNzN=N≥1∑NHNzN−1=N≥1∑NHNzN=N≥1∑N(HN−1)zN=N≥0∑N(HN−1)zN(differentiating)(right shifting)(subtracting z/(1−z)2=∑N≥1NzN)(altering index for N).
The last step wasn’t really needed, since any "missing" term has a coefficient of zero.
Exercise 3.4
The solution is available as part of the course material on the book’s website (see slide 13). The key takeaway is that the LHS of an OGF may be conveniently factored to craft the desired RHS. For example, the OGF 1/(1−z)2=∑N≥0(N+1)zN is a left shifted elementary OGF from Table 3.1. It simply appeared as a result of applying the partial sum operation on another OGF. Nonetheless, its convolution with ln1−z1 produced a straightforward sum on the RHS that we were able to handle.
Exercise 3.5 🌟
This exercise demonstrates the advantages of representing an entire sequence using a single mathematical construct, such as a generating function. By manipulating generating functions, we can transform sequences and derive new identities through factoring. Each factorization corresponds to the same sequence, providing multiple perspectives for describing its terms. The preceding exercise has already illustrated this methodology in practice.
We need to prove a general identity satisfied by the harmonic numbers and binomial coefficients by factoring the initial OGF in different ways. Let’s start with
A(z)(1−z)M+1zM⋅B(z)ln1−z1.
Both components are elementary OGFs from Table 3.1. We have by definition
[zN](A(z)B(z))=k=M∑N−1(Mk)N−k1.
Looking at the initial OGF, we see that it is
C(z)(1−z)M+11⋅D(z)ln1−z1
right shifted M times, or from an opposite direction, the latter is a result of left shifting the original one the same number of times. Therefore, [zN](A(z)B(z))=[zN−M](C(z)D(z)). The book Concrete Mathematics: A Foundation for Computer Science (2nd ed.) contains an identity (7.43)
C(z)D(z)=N≥0∑(HM+N−HM)(NM+N)zN.
Therefore,
M≤k<N∑(Mk)N−k1=(HN−HM)(N−MN).
Exercise 3.6
We are given the sequence
{0<k<n∑k(n−k)1}n>1,
and we need to find its OGF. We presume the same interpretation of the above definition as in Exercise 3.1. It’s doubly left shifted self-convolution of an elementary OGF ln1−z1.
A(z)=z2ln21−z1=z2(−ln(1−z))2=z2ln2(1−z).
Exercise 3.7
We need to find the OGF for {Hk/k}k≥1. Assume the same interpretation of this definition as in the previous exercise. A recipe for crafting the required OGF is as follows:
Start with the elementary OGF for the harmonic numbers.
Left shift it.
Apply the index divide (integration) operation.
Left shift again.
Despite being simple at first blush, the third step requires some advanced knowledge from calculus. The first two steps are
We can’t directly integrate the LHS, since we have an improper integral (isn’t defined at zero). Let’s break it up and compute the components individually. We get
Let’s tackle first the common OGF A(z)=ln21−z1 that appears in both subtasks. As usual, there are many equivalent expressions for [zN]. Let’s reuse the results from Exercise 3.6 and Exercise 3.7.
[zN]A(z)=0<k<N∑k(N−k)1=N2HN−1 for N>1.
For N≤1 the coefficients are zero.
[zN] for 1−z1ln21−z1
We apply the partial sum operation on A(z) to get B(z)=A(z)/(1−z). For pedagogical reasons, we show here another derivation of the identity stated above for A(z) (see step 3). We have
[zN]B(z)=1<i≤N∑0<j<i∑j(i−j)1=1<i≤N∑i10<j<i∑(j1+i−j1)(j(i−j)1=i1(j1+i−j1))=1<i≤N∑i2Hi−1=21<i≤N∑iHi−i1=2i=1∑NiHi−2i=1∑Ni21(set i to start at 1 without influencing the sum)=2(21(HN2+HN(2)))−2HN(2)(by identity (6.71))∴[zN]1−z1ln21−z1=[zN](ln1−z1)(1−z1ln1−z1)⟹k=1∑N−1kHN−k=HN2−HN(2).
In all subproblems, {ak}k≥m denotes a sequence am,am+1,…; for example, {k3}k≥2 starts with 8 (without leading zeros).
Sequence {2k+1}k≥0
We can represent {2k+1}k≥0 as 2{2k}k≥0, which is 2x an elementary EGF from Table 3.3 with c=2. Thus, 2e2z=k≥0∑2k+1k!zk. Observe, that we can attain the same result by left shifting the elementary EGF e2z instead of first pulling out a constant factor 2.
Sequence {k2k+1}k≥0
We can represent {k2k+1}k≥0 as 2{k2k}k≥0. For the scaled sequence, the starting point is e2z. Notice that in this sequence ak=2ak−1 for k>1. Next, we need to perform an index multiply operation. To get kak instead of kak−1, we must again multiply by 2. Thus, 4ze2z=k≥0∑k2k+1k!zk.
Sequence {k3}k≥2
Notice that k3=6(3k)+6(2k)+k, so that our sequence is doubly left shifted sum of the corresponding elementary EGFs from Table 3.3. We have
By definition A(z)=∑k≥0akk!zk. We also know that
N![zN]B(z)=0≤k≤N∑N!k!ak=0≤k≤N∑(kN)ak(N−k)!,
where B(z) is the unknown EGF. The last summation suggests that it’s a binomial convolution of A(z) and 1/(1−z). Hence,
B(z)=A(z)1−z1.
Exercise 3.14 🌟
This exercise introduces the Laplace transform for converting from the OGF for a sequence to the EGF for the same sequence, and vice versa.
By definition A(z)=∑k≥0akk!zk. We must prove that
∫0∞A(zt)e−tdt=k≥0∑akzk,
provided the integral exists. The conversion is based on the Laplace transform, as mentioned in the book. The proof revolves around computing the definite integral. We have
∫0∞A(zt)e−tdt=∫0∞(k≥0∑aktkk!zk)e−tdt=k≥0∑akk!zkΓ(k+1)=k!∫0∞tke−tdt=k≥0∑akk!zk⋅k!=k≥0∑akzk.(assuming convergence to flip ∑ and ∫)
The second line uses the notion of a gamma function. This establishes the desired identity.
Examples
If A(z)=ecz then
∫0∞A(zt)e−tdt=∫0∞e−(1−cz)tdt=1−cz1.
Type the following snippet into Sage to evaluate the above integral:
If A(z)=zez then
∫0∞ztezte−tdt=(1−z)2z.
Type the following snippet into Sage to evaluate the above integral:
Exercise 3.15
We can read this out from Table 3.4 as well as by looking at the first example in the previous exercise. Thus, the EGF for the Fibonacci numbers is
F(z)=51(eϕz−eϕ^z).
If we want to derive it directly, without using the corresponding OGF, then we need to multiply the recurrence by zn/n! and sum on n in the first step. Now, F(z)=∑k≥0Fkk!zk, hence we get the following differential equation (recall that right shifting is integration)
with initial conditions F(0)=F0=0 and F′(0)=F1=1. For the sake of variety. let’s solve it using WolframAlpha. After issuing the command y'' - y' - y = 0, y(0) = 0, y'(0) = 1 we get the same result.
Exercise 3.16 🌟
We describe the process in detail for the first example. The extra summand, which makes a recurrence valid for all n, is the Kronecker delta function.
ananA(z)=−an−1+6an−2=−an−1+6an−2+δn1=−zA(z)+6z2A(z)+z=1+z−6z2z=(1−2z)(1+3z)z=51(1−2z1−1+3z1)∴an=512n−51(−3)n for n>1 with a0=0 and a1=11. Make it valid for all n.2. Multiply by zn and sum on n.3. Solve.4. Simplify.5. Expand.
Performing a partial fractions expansion is easy in WolframAlpha. For example, issue Apart[z/(1 + z - 6z^2)] to simplify the polynomial expression in the 4th step (see above).
We use EGF for the next recurrence to demonstrate how it unfolds and to avoid technical difficulties in implementing partial fractions expansion due to unwieldy roots. Solving a homogeneous third-order linear ordinary differential equation with constant coefficients is essentially the same as solving the corresponding recurrence via roots of the characteristic polynomial. The resulting EGF is easily mapped to a closed-form solution (see the previous two exercises).
anA′′′(z)A(z)=11an−2−6an−3 for n>2 with a0=0 and a2=a1=1=11A′(z)−6A(z)with A(0)=0,A′(0)=1,A′′(0)=1...find the roots r1,r2,r3 of the characteristic equation z3−11z+6=0...=c0er1z+c1er2z+c2er3z...solve for constants based on initial conditions...=41e3z+136−17−17e(2−3+17)z+136−17+17e(2−3−17)z∴an=41⋅3n+136−17−17(2−3+17)n+136−17+17(2−3−17)n.
The next two examples have complex roots.
ananA(z)=3an−1−4an−2 for n>1 with a0=0 and a1=1=3an−1−4an−2+δn1=3zA(z)−4z2A(z)+z=1−3z+4z2z=7i(1−(3/2−i7/2)z1−1−(3/2+i7/2)z1)∴an=7i((23−i7)n−(23+i7)n)
Using the technique illustrated in Exercise 2.31, we can convert the above expression into trigonometric form. Thus, an=72n+1sin(narccos(43)).
ananA(z)=an−1−an−2 for n>1 with a0=0 and a1=1=an−1−an−2+δn1=zA(z)−z2A(z)+z=1−z+z2z=3i(1−(1/2−i3/2)z1−1−(1/2+i3/2)z1)∴an=3i((21−i3)n−(21+i3)n)
The trigonometric form is an=32sin(narccos(21))=32sin(n3π).
Exercise 3.17 🌟
We need to solve the recurrence
an=5an−1−8an−2+4an−3 for n>2 with a0=1, a1=2 and a2=4.
The book provides a solution for different initial conditions, but we can reuse some parts of it. More specifically,
This is an elementary OGF from Table 3.1. We have an=2n.
Exercise 3.18 🌟
ang(z)f(z)a(z)=2an−2−an−4 for n>3 with a0=a1=0 and a2=a3=1=1−2z2+z4=(1−z)2(1+z)2=z2+z3=z2(1+z)=(1−z)2(1+z)z2=(1−z)2z⋅1+zz.
The resulting OGF is a convolution of basic OGFs. We have
[zN](1−z)2z⋅1+zz=1≤k<N∑k(−1)N−k−1=41(−1)N(2N(−1)N−(−1)N+1)={2N2N−1if N is even,if N is odd.
The above sum can be computed by entering Sum[k(-1)^(N-k-1),{k,1,N-1}] in WolframAlpha. It may be instructive to dissect the sum in detail, but this is the topic of other books, like, Concrete Mathematics: A Foundation for Computer Science (2nd ed.). Afterward, we can further simplify the result by considering parity.
We can formulate a concise answer as
an=2n+4(−1)n−1.
Exercise 3.19 🌟
ang(z)f(z)a(z)=6an−1−12an−2+18an−3−27an−4 for n>3 with a0=0 and a1=a2=a3=1=1−6z+12z2−18z3+27z4=(1−3z)2(1+3z2)=z−5z2+7z3=(1−3z)2(1+3z2)z−5z2+7z3=72(1−3z)1+36(1−3z)21+24(1+3z2)19z−1=721⋅(1−3z1)+z1⋅3⋅361⋅((1−3z)23z)+z⋅2419⋅(1−(i3z)21)−241⋅(1−(i3z)21).
Again, tool support is indispensable. The decomposition of a(z) was done in WolframAlpha by issuing the Apart[(z - 5 z^2 + 7 z^3)/((1 - 3 z)^2 (1 + 3 z^2))] command.
The resulting OGF is a sum of scaled and/or shifted elementary OGFs. After simplifying terms, and reusing the analysis from the book about OGFs of the form 1/(1+z2), we get
This exercise pinpoints an important fact about partial fractions expansion. Namely, sometimes it makes sense to work with higher degree polynomials. We wouldn’t gain anything by breaking up (1+3z2), since it is already a scaled elementary OGF from Table 3.1.
Exercise 3.20 🌟
ang(z)f(z)a(z)=3an−1−3an−2+an−3 for n>2 with a0=a1=0 and a2=1=1−3z+3z2−z3=(1−z)3=z2=(1−z)3z2∴an=(2n).
ang(z)f(z)a(z)=3an−1−3an−2+an−3 for n>2 with a0=0 and a2=a1=1=1−3z+3z2−z3=(1−z)3=z−2z2=(1−z)3z−2z2=z1⋅(1−z)3z2−2(1−z)3z2∴an=(2n+1)−2(2n)=2n(3−n).
Exercise 3.21
Let A(z)=∑n≥0anzn represent our sequence. We know that
Looking at the form of the coefficient for n≥t, it’s definitely a convolution of some OGF B(z) with A(z). Therefore, we need to find this unknown OGF and substitute it into the expression:
A(z)=zt−1−B(z)A(z).
Observe the term zt−1, which denotes the initial value at−1=1. The convolution only covers the case when n≥t. The range of k tells that the sequence represented with B(z) has the following values:
0,1≤k≤t−t,(2t),…,(−1)t,0,…
By searching Table 3.1, it occurs that B(z)=(1−z)t−1, since it is a scaled elementary OGF with a leading zero. Thus,
A(z)=(1−z)tzt−1⟹an=(t−1n) for n≥0.
This exercise nicely demonstrates the power of generating functions. Imagine solving this complicated recurrence using "classical" methods.
Exercise 3.22 🌟
Let a(z)=∑n≥1anzn with an implicit assumption that a0=0.
nann≥2∑nanznza′(z)−zza′(z)−za′(z)−1(1−z)a′(z)=(n−2)an−1+2 for n>1 with a1=1=n≥2∑(n−2)an−1zn+2n≥2∑zn=n≥2∑nan−1zn−2n≥2∑an−1zn+2(1−z1−z−1)=z(za(z))′−2za(z)+21−zz2=a(z)+za′(z)−2a(z)+21−zz=−a(z)+1−z1+z.
We obtain the solution to this differential equation by solving the corresponding homogeneous equation
ϱ(z)ϱ′(z)=−1−z1,
to get an “integration factor” ϱ(z)=1−z. This gives
(1−za(z))′=(1−z)2a′(z)(1−z)+a(z)=(1−z)31+z.
Integrating (for example, by using WolframAlpha), we get the result
a(z)=1−zz⟹an=1 for n>0.
We have already encountered this recurrence in Exercise 2.11. This was another way to attain the same outcome.
Exercise 3.23
Let a(z)=∑n≥0anzn. We assume that t is a positive integer.
nann≥1∑nanznza′(z)za′(z)za′(z)a′(z)a′(z)a(z)a′(z)=(n+t−1)an−1 for n>0 with a0=1=n≥1∑(n+t−1)an−1zn=n≥1∑nan−1zn+(t−1)n≥1∑an−1zn=z(za(z))′+(t−1)za(z)=z2a′(z)+za(z)+(t−1)za(z)=za′(z)+a(z)+(t−1)a(z)=za′(z)+ta(z)=1−zt∴a(z)=(1−z)t1=zt−11⋅(1−z)tzt−1.
Thus, a(z) is an elementary OGF, left shifted t−1 times. This gives
an=(t−1n+t−1) for n≥0.
Exercise 3.24
Fast forwarding the analysis from the book, our recurrence relationship corresponds to a differential equation on the generating function
a′(z)=(1−z)32+t1−za(z).
We obtain the solution to this differential equation by solving the corresponding homogeneous equation
ϱ(z)ϱ′(z)=1−zt,
to get an “integration factor” ϱ(z)=1/(1−z)t. This gives
Integrating (for example, by using WolframAlpha), we get the result (recall that a(0)=a0=0)
a(z)=t−2−2⋅(1−z)21+t−22⋅(1−z)t1.
The first term is a left shifted elementary OGF, so it’s coefficient is proportional to n. The second term is t−1 times left shifted elementary OGF from Table 3.1, so behaves like (t−1n+t−1)∼nt−1. We differentiate two cases:
If t=2−ϵ, then an=Θ(n)=o(nlgn).
If t=2+ϵ, then an=Θ(n1+ϵ)=ω(nlgn).
The threshold t=2 is a bifurcation point, where even a tiny disturbance entails an asymptotically different behavior.
We already know the result for the exponential sequence. Since 2z=eln2z=ezln2, we have to substitute z=zln2 in the formula from the book. We get
2z=k≥0∑k!lnk2zk.
Expansion of zez
We just need to multiply each term on the RHS of the expansion of ez with z. This gives
zez=k≥1∑(k−1)!zk.
Exercise 3.26 🌟
This exercise introduces the general Leibniz rule pertaining to the rule for the nth derivative of the product of two functions.
Let’s develop some intuition about the stated recurrence for coefficients. Tools are excellent in performing exploratory tasks like this. Issuing taylor series 1/(1-az-bz^2) in WolframAlpha gives us the terms of the expansion:
Now we evaluate the differentiated equation above at z=0. This gives
f(n)(0)−a⋅n⋅f(n−1)(0)−b⋅n(n−1)⋅f(n−2)(0)cnn!f(n)(0)−acn−1(n−1)!f(n−1)(0)−bcn−2(n−2)!f(n−2)(0)=0=0(dividing by n!)
This concludes the proof about the recurrence relation for coefficients.
Exercise 3.27
As in the previous exercise, we rely on the general Leibniz rule. Our starting point is
H(z)=f(z)1−z1⋅g(z)ln1−z1.
We have to find cn=H(n)(0)/n!. Notice that g′(z)=f(z)⟹g(n−1)(0)=f(n)(0). We also know from the book that the kth derivative of f(z) is k!(1−z)−k−1. Consequently, f(k)(0)=k!. Finally, g(0)(0)=g(0)=0, so the summation in the Leibniz rule starts at index k=1. Combining all these facts, we have
Let’s follow the hint from the book and seek for the expansion of (1−z)−α. It’s given in the article on Wikipedia about the negative binomial series. We have
As a quick check, for α=2 we indeed get the expansion of a left shifted OGF from Table 3.1.
In our case, α=1/2. This gives
[zn]1−z1ln1−z1=(nn−21)(Hn−21−H−21).
We can use the identity below, pertaining to the generalized binomial coefficients, to simplify the first term in the RHS. The book already shows the derivation of (n−21).
We already have an expression for a generic parameter α (see the previous exercise). Thus,
[zn](1−z)t1ln1−z1=(nn+t−1)(Hn+t−1−Ht−1).
Exercise 3.30
The identity is trivial to derive by noticing that
[zN]1−4z1⋅1−4z1=[zN]1−4z1.
Both OGFs are elaborated in the book.
Exercise 3.31
The differential equation (3) corresponds to (1), which is the basic recurrence describing the number of comparisons used by quicksort. Multiplying (3) by (1−z)2 is equivalent to differencing the coefficients twice in succession. We get
This recurrence is describing probabilities of derangements. If we substitute an=dn/n!, simplify factorials, multiply through by n!, then we get MathWorld's equation (5) with the index shifted by 1 (see the previous link). The equivalent equation (6) can be solved using the method presented in Exercise 2.14. We would arrive at the same result, as shown below.
An easier path is to solve the transformed ODE, which resolves to a partial sum of a scaled elementary OGF.
A(z)A′(z)=1−zz∴A(z)=e−z⋅1−za0(A(0)=a0).
Expanding it gives
[zn]A(z)=an=a00≤k≤n∑k!(−1)k.
Exercise 3.33
The LHS is the convolution of two series (see Table 3.1):
By equating [zn] from both sides, we obtain the identity:
k≥0∑(−1)k(ks)(n−kr)=k≥0∑(−1)k(ks)(n−2kr−s).
What’s the benefit of this identity? Albeit both sides being semantically equivalent, the RHS effectively halves the computation time. As soon as n−2k<0 we must stop the summation of the RHS. On the LHS, the matching stoppage criteria is n−k<0. When r and/or s are integers then we have additional constraints on the range of k to avoid nonzero terms. At any rate, the aforementioned optimization is especially important in the case when r and s are real numbers, due to additional increase in accuracy.
The first term is a simple convolution of a left-shifted F(z) and a sequence with only two leading ones (the rest are all zero). The second term is an elementary OGF, denoting a sequence of all ones. Expanding these OGFs we get
Wikipedia contains a derivation of the sum expression for −[zn] (click the button on Wikipedia to reveal the proof). The generating function in this exercise is a negative of the one associated with Bernoulli numbers.
Exercise 3.37 🌟
We should use generating functions to find a sum expression for [zn]2−ez1. The idea is to employ substitution, as for Fibonacci numbers in the book, and afterward transform the substituted term with another power series. We have
To understand the last step, fix k=m≥n. The coefficient [zm] will "accumulate" contributions from all n≤m. Swapping the roles of n and k gives us the desired expression.
If we substitute z=y/(y−1), then we get
A(y)=1−y1B(y−1y),
which, by symmetry, shows the truthiness in the opposite direction.
Exercise 3.40
Wikipedia contains an analysis of the binomial transform including the sketch of the proof without using generating functions.
More formally, suppose we start with an arbitrary sequence {an}. We can generate another sequence {bn} from it
bn=k=0∑n(kn)(−1)kak⟺bn=(−1)nΔna0,
where Δ is the forward difference operator defined by Δan=an+1−an. The nth forward difference at index 0 is given by:
Δna0=k=0∑n(kn)(−1)n−kak.
Multiplying both sides by (−1)n yields the previously stated equivalence.
The transformation is symmetrical and an=∑k=0n(kn)(−1)kbk. We can see this by introducing the shift operator an+1=Ean=Ian+Δan=(I+Δ)an, where I is an identity operator. We want to express an in terms of applying the shift operator n times to a0. This gives
Observe, that IkΔn−k=Δn−k. This concludes the proof.
Exercise 3.41 🌟
Wikipedia contains the formal power series version of the Faà di Bruno’s formula, based on the application of the multinomial theorem. An important detail is the precondition g(0)=0, which makes the composition valid.
Exercise 3.42
Let’s follow the hint from the book and find a differential equation satisfied by the function
f(z)f′(z)=ez+z2/2=ez+z2/2(1+z)=f(z)(1+z).
Using Table 3.4, we can convert the above differential equation into recurrence relation on coefficients. This gives
The application of Taylor’s theorem establishes a one-to-one correspondence between coefficients fn and f[n](0), this is why coefficients satisfy the recurrence established by the previous differential equation. Notice, that we could have executed one more iteration by finding f′′(z), but it would give the same outcome as our shortcut.
Exercise 3.43
Let f(z)=∑nfnn!zn, thus f(z/2)=∑n2nfnn!zn. Then, provided that f(0)=1, we must have
Let’s dissect in detail the grouped section, to reveal the pattern. The leading term is similar to the example from the book, just with an opposite sign. This provides the first e−2z summand. The following steps are done from inside out
Continuing these iterations will result in a geometric progression that sums to e−2z. The constant terms annihilate with a remaining +1 at the end. This cancels the -1 in the original formula. As mentioned in the book, technically, we need to justify carrying out the iteration indefinitely, but the solution is easily verified from the original recurrence. Indeed, replacing f(z)=e2z verifies this statement.
Therefore, the recurrence relation and its solution are
fn=2n+k∑(−1)n−k(kn)2kfk,fn=2nfor n>0 with f0=1.
Observe that -1 (from the initial expression) disappears, since it doesn’t influence coefficients for n>0.
Exercise 3.44 🌟
We need to find an explicit formula for the OGF of the sequence satisfying the divide-and-conquer recurrence
f2nf2n+1=f2n−1+fn=f2nfor n≥1 with f0=0,for n>0 with f1=1.
This exercise showcases a more intricate functional equation, where interplay of two sub-functions define the base OGF. Since the recurrence relations distinguish between even and odd indices, let’s split f(z) accordingly
fo(z)=f1z+n≥1∑f2n+1z2n+1=z+n≥1∑f2nz2n+1=z+zn≥1∑f2nz2n=z+zn≥0∑f2nz2n∴fo(z)=z+zfe(z).(f1=1 and f2n+1=f2n)(f0=0 so we can start n at 0)
We can prove the correctness of the formula by induction on n. The base case is trivially satisfied; recall that an empty product equals one. The inductive step is based on the following expansion rule
f(1+(n−1)zz)=1+1+(n−1)zzf(1+nzz).
Therefore, the series in the nth iteration becomes
by the inductive hypothesism=0∑n−1∏k=0m−1(1+kz)zm+∏k=0n−2(1+kz)zn−1⋅1+(n−1)zz=m=0∑n∏k=0m−1(1+kz)zm.
Exercise 3.46 🌟
Given f(z) defined by the equation f(z)=z/(1−f(z2)) we have to find explicit expressions for a(z) and b(z) with f(z)=a(z)/b(z). We assume that f(z) is analytic at z=0, thus f(0)=a(0)=0 and b(0)=1 (actually, any scaling factor would work).
Substituting f(z)=a(z)/b(z) into the given equation, we get
b(z)a(z)=1−b(z2)a(z2)z=b(z2)−a(z2)zb(z2).
Comparing the numerators and denominators, we can set up the following system of equations:
a(z)b(z)=zb(z2)=b(z2)−a(z2)(1)(2).
Substituting (1) into (2), we get b(z)=b(z2)−z2b(z4). We can now solve for b(z). Let b(z)=∑n≥0bnzn, so by substituting this series into the recurrence, we get
n≥0∑bnzn=n≥0∑bnz2n−n≥0∑bnz4n+2.
Apparently, the RHS has only even powers, hence b2n+1=0 for n≥0.
If n is even, then bn=bn/2−b(n−2)/4. It helps to view n as a binary number, where division by 2 is right-shifting by one position. The next 4 cases illustrate what happens under the hood, as recursion unfolds until reaching the base case n=0. X2 denotes any binary prefix.
If n=X2010, then bn/2=0 (odd index) and b(n−2)/4=−bX20.
If n=X2000, then bn/2=bX200 and b(n−2)/4=0 (non-integer index).
If n=X2100, then bn/2=bX210 and b(n−2)/4=0 (non-integer index).
If n=X2110, then bn/2=0 (odd index) and b(n−2)/4=0 (odd index) .
Thus, by induction on bit position, bn is non-zero if and only if the binary expansion of n contains no consecutive 1s. Furthermore, advancing via bn/2 drops a trailing zero, whilst b(n−2)/4 removes a binary 1 and flips a sign. Thus,
b(z)=n∈S∑(−1)νnz2na(z)=n∈S∑(−1)νnz4n+1,
where S is a sequence registered under A003714 and νN is the population count (see Chapter 2 in the book). Observe that νN=ν2N.
Exercise 3.47
Only f(z)=0 satisfies that f(z)=sin(f(z)), where f(z)=∑n≥1fnzn.
Suppose, for the sake of contradiction, that there is also another power series g(z)=0 of the same form, satisfying the same condition. Let gk be the lowest indexed non-zero coefficient for k≥1. Using Taylor's expansion of the sine function, and passing g(z) as an argument, we have
sin(g(z))=g(z)−6g(z)3+120g(z)5−…
But this entails that
g(z)=g(z)−6g(z)3+120g(z)5−…
since g(z)=sin(g(z)) by definition. Subtracting g(z) from both sides, we get
0=−6g(z)3+120g(z)5−…
The lowest term in the RHS is −61gk3z3k, since the others have higher powers of z, so gk must be zero. But this contradicts our assumption that gk=0. Therefore, f(z)=0 is a unique solution.
Exercise 3.48
The text of this exercise in the book is imprecise. The paper Singularity Analysis Of Generating Functions* contains a section about 2-3 trees, where the matching functional equation represents the number of such trees with n external nodes.
Let f(z)=∑n≥0fnzn be the OGF for the number of 2–3 trees , where fn denotes the number of trees with n external nodes (leaves). Based on the functional equation f(z)=z+f(z2+z3), we see that f0=0 and f1=1. Notice, that the second summand generates higher-order terms, so for n>1 the first summand has no effect. We can substitute the form of OGF into the functional equation to get
To find the recurrence for fn, we equate coefficients of the same rank on both sides. This gives the following constraints on k, using the fact that 0≤j≤k, for which there is always some j such that 3k−j=n:
3k−j≤n⟹2k≤n⟹k≤⌊n/2⌋.
3k−j≥n⟹3k≥n⟹k≥⌈n/3⌉.
Thus, for any fixed n>1, we have
fn=k=⌈n/3⌉∑⌊n/2⌋(3k−nk)fk.
The next Python script computes the number of 2–3 trees of 100 external nodes.
It prints 5,520,498,313,790,316,062.
As a verification step, it is advisable to search for the sequence in the online encyclopedia. For instance, by displaying 20 values generated by this script (including the initial terms), we can confirm that it corresponds to an established sequence A014535.
Exercise 3.49 🌟
The book, at the time of this writing, has an error; t+1 must be replaced by t−1.
We show by induction on t that (1−z)tC(t)(z)=Ψ(Ψ+1)…(Ψ+t−1)C(z).
Observe, that Ψ can be manipulated as any other "symbol" in arithmetic expressions. For the sake of completeness, let’s show that (Ψ+m)(Ψ+n)F(z)=(Ψ+n)(Ψ+m)F(z) for any nonnegative integers n, m and arbitrary function F(z). We have
We now proceed with the inductive proof. The base case is trivially satisfied, since ΨC(z)≡(1−z)C′(z). For the inductive step, we have
(Ψ+t−1)((1−z)t−1C(t−1)(z))=Ψ((1−z)t−1C(t−1)(z))+(t−1)(1−z)t−1C(t−1)(z)=(1−z)(−(t−1)(1−z)t−2C(t−1)(z)+(1−z)t−1C(t)(z))+(t−1)(1−z)t−1C(t−1)(z)=−(t−1)(1−z)t−1C(t−1)(z)+(1−z)tC(t)(z)+(t−1)(1−z)t−1C(t−1)(z)∴(1−z)tC(t)(z)=(Ψ+t−1)by the inductive hypothesisΨ(Ψ+1)…(Ψ+t−2)C(z).
By successively swapping neighboring elements, we can move (Ψ+t−1) into the last position without disturbing the relative ordering of the other terms. This proves that the statement of this exercise holds for any t≥1.
Exercise 3.50
Sedgewick’s paper The Analysis of Quicksort Programs contains a full analysis of median-of-three quick-sort algorithm (see section 3.2). It derives the base formula in a more detailed manner than the book itself.
Exercise 3.51
Section 3.3 of the paper mentioned in the previous exercise shows the analysis of the median of 2t+1 quick-sort algorithm. Therefore, for a sample size of 5 we would use t=2.
Exercise 3.52 🌟
This exercise examines advanced calculus concepts and is indirectly related to the book, via linkage to generating functions. In order to encompass diverse perspectives, the answer is more focused on the methodology itself, whilst low-level details are skipped (like, step-by-step derivations of intermediate results). Otherwise, writing up everything would demand a separate chapter.
To make the exposition comprehensible as much as possible, new concepts are introduced only when necessary. In calculus, there are many generalizations whose usage may allow concise writeup. For example, we avoid using the differential operator, although it would make the text more formal and compact.
Unlike the classical Euler-Cauchy equation, where the power of the independent variable scales directly with the order of the derivative, this equation exhibits an inverse scaling relationship: as the order of the derivative j increases, the power of the coefficient (1−z)r−j decreases.
Leading-Coefficient Observation
Let’s expand the LHS
ar(z)f(r)(z)+ar−1(z)f(r−1)(z)+⋯+a0(z)f(z),
with aj(z)=(1−z)r−j. The coefficient of the highest derivative is
ar(z)=(1−z)r−r=(1−z)0=1,
so the leading coefficient is nonzero at z=1. Consequently, for the homogeneous equation the point z=1 is an ordinary point in the sense that the normalized coefficients aj(z)/ar(z) are analytic at z=1. This implies homogeneous solutions are analytic at z=1 and admit Taylor expansions about that point.
The inhomogeneous term (1−z)α may be non-analytic at z=1, so requires a careful analysis. Our focus will be on the Frobenius method. While it’s traditionally taught as a tool for regular singular points, its utility extends to constructing particular solutions for inhomogeneous equations, as in our case. We’ll showcase the Frobenius method in "inverted" regime.
The method of Frobenius can be viewed as a superset of the power series method. This is another reason why it’s a natural choice for this exercise related to a chapter about generating functions.
Change of Variable
Let x=1−z, such that
dzd=−dxd,dzjdj=(−1)jdxjdj,
and the equation becomes
j=0∑r(−1)jxr−jf(j)(x)=xα
for the inhomogeneous problem. Because the coefficient of f(r)(x) is (−1)rxr−r=(−1)r, a nonzero constant, the homogeneous equation has analytic coefficients at x=0. The above canonical transformation centers a potential singularity at the origin.
First-Order Case r=1
It always make sense to start with the simplest form to develop intuition and cement understanding of the task. The transformed equation is
with initial relations a2=0 and a3=a1/6. The recurrence links coefficients of the same parity, hence the solution space splits into an even {en} and odd {on} series. To peek into the derivation details, it’s instructive to read an analysis of the Hermite differential equation.
For finding the particular solution, we employ the Frobenius method. We start with the default ansatz
fP(x)=xkn≥0∑bnxn.
The lowest term is from fP′′(x), thus matching to xα yields k=α+2. This gives
fP(x)=n≥0∑bnxα+2+n.
Matching coefficients for power xα+n produces a recurrence of the form
If the denominator of any coefficient becomes zero, for instance, when α=−2, then we have to modify the ansatz. The Frobenius method mandates the inclusion of a logarithmic term
fP(x)=xα+2(Alnxn≥0∑bnxn+n≥0∑cnxn).
The logarithmic coefficients are determined by substituting this ansatz and solving the resulting linear system.
Again, the homogeneous part can be solved using OGFs. The recurrence relation becomes a difference equation of order r. Specifically, the coefficient of xn in the expansion involves terms from an+r down to an−r; xr−j is paired with dj/dxj, so the shift in power is always (r−j)−j=r−2j. Thus, the parity of the shifts (even vs odd) is fixed by the parity of r. There are two cases:
If r is even, then every shift is even. The recurrence links coefficients whose indices differ by even integers only, so the power‑series splits into two independent parity subspaces (even and odd). Each parity subspace supplies independent solutions. We have already seen example of this with r=2.
If r is odd, then every shift is odd. The recurrence links coefficients of opposite parity (indices differing by odd integers), so even and odd coefficients are coupled together. There is no decomposition into two independent parity subspaces; the recurrence mixes both parities and produces a single chain of coupled coefficients that yields the full set of r independent solutions.
The parity decomposition is a computational convenience (it halves the work) whenever r is even.
For finding the particular solution, we employ the Frobenius method, starting with the default ansatz. The highest-derivative term f(r) produces the lowest power, thus matching to xα gives
k−r=α⟹k=α+r.
Hence, the particular solution begins at xα+r
fP(x)=n≥0∑bnxα+r+n.
Of course, we must take care of exceptional cases, as in the situation with r=2.
We define T to be the set of all nonempty full binary trees, and adopt the notation ∣t∣ to represent, for t∈T, the number of external nodes in t. Then we have the following derivation:
The first line is an alternative way to express T(z) from its definition. Each tree with exactly k external nodes contributes exactly 1 to the coefficient of zk, so the coefficient of zk in the sum “counts” the number of trees with k external nodes. The second line follows from the recursive definition of full binary trees: either a binary tree has one external node (which accounts for the z), or it can be decomposed into two independent full binary trees whose external nodes comprise the external nodes of the original tree, without the root (internal node). The third line follows because the index variables tL and tR are independent.
Observe, in this case T(0)=0. It is always worthwhile to check our the math with a computer. The next Sage snippet correctly prints the first few known initial values and T6=42=61(510):
Exercise 3.55 🌟
The given OGF is a rational function
D(z)=(1−z)(1−z5)(1−z10)(1−z25)(1−z50)1.
The book contains a remark, that we can learn much about asymptotic behavior of coefficients by looking at singularities of the generating function. A kth root of unityω for k∈{1,5,10,25,50} makes the denominator zero. To understand the coefficients, we look at the partial fraction decomposition of D(z) and connect it to the previously mentioned singularities.
The value z=1 has order 5, as it makes any factor 1−zk zero in the denominator. The matching summand in the decomposition of D(z) would participate as C/(1−z)5, where C is some constant. Looking at Table 3.1, this would be represented in expanded form as ∑N≥0(NN+4)zN. Therefore, the coefficient of zN would grow as Θ(N4). Observe, that the other roots of unity are associated with lower multiplicities. Nevertheless, they do introduce cyclic fluctuations to the count, since they are periodic.
Combining the polynomial growth from z=1 and the periodicity from the other roots of unity, we get that the form of [zN]D(z) is a quasi-polynomial. If we denote by r≡N(mod50), where 50 is a least common multiple of the coin denominations, then we have
where the coefficients ai(r) depend on the remainder r.
Exercise 3.56 🌟
The next Python program uses the mathematical model from the previous exercise. The idea is to precompute coefficients for r∈[0,49] and afterward evaluate the quasi-polynomial for a given N. The implementation employs dynamic programming as well as fast linear equations solver from the Numpy package. The system of linear equations is represented as a Vandermonde matrix.
Running this program produces the following output:
LeetCode also offers this problem under Coin Change II, although covers only small cases.
Exercise 3.57
This is a simple variation of Polya's problem of changing a dollar. Instead of coins we have powers of 2 excluding 20. Let the index pi denote the number of 2i combinatorial object. This gives
Each configuration of powers of 2 that adds up to N clearly contributes exactly 1 to the coefficient of zN. The second line follows since the indices of summation are all independent. The last line is a translation based on Table 3.1 from the book.
Exercise 3.58 🌟
This exercise derives the foundational identity that 1−z and f(z)=∑nzn are inverses of each other. We can show this by treating f(z) as a formal power series
The base case t=1 is trivially satisfied. For t>1 we have
n=0∏t−1(1+z2n)=n=0∏t−2(1+z2n)(1+z2t−1)=n=0∑2t−1−1zn(1+z2t−1)=n=0∑2t−1−1zn+z2t−1n=0∑2t−1−1zn=n=0∑2t−1−1zn+n=2t−1∑2t−1zn=n=0∑2t−1zn.(by the inductive hypothesis)
When t→∞ then ∑nzn=∏n(1+z2n), so Euler's identity follows.
The identity reflects the fact that every non-negative integer has a unique binary representation. Expanding the product is equivalent to forming exponents by summing distinct powers of 2. Since every integer n can be written as a sum of powers of 2 in exactly one way, every term zn appears exactly once in the resulting series.
Exercise 3.59
In base 2, the digits are {0,1}, so the factors were (1+z2n)=(z0⋅2n+z1⋅2n).
In base 3, the digits are {0,1,2}. This means that for each position n (representing the value 3n), we need a term that offers the choice of adding 0⋅3n, 1⋅3n, or 2⋅3n to the exponent. Therefore, the generalized identity is
1−z1=n∏(1+z3n+z2⋅3n).
All the other details are analogous to the previous exercise.
Exercise 3.60
As in Exercise 3.58, expanding the product is equivalent to forming exponents by summing distinct powers of 2. But this time the expansion also happens with a sign flip. Consequently, adding each power of 2 is equivalent to increasing the number of ones in a binary representation of N by 1. We start with the factor 1−z, hence even number of ones are designated with +1, whilst odd number of ones with -1. This gives
Let X be the random variable that counts the number of leading zeros in a random binary string. For a string of size n with independent fair bits, the probability mass function is
The standard deviation is σX=var(X). When n→∞ then the mean is 1 and the standard deviation is 2.
Exercise 3.65
We must find the only missing piece pn′′(1). We have
pn′′(u)=n(n−1)(1+u)n−2⟹pn′′(1)=n(n−1)2n−2.
After substituting all known quantities into the formula from Table 3.5, we get that the variance for the number of 1 bits in a random binary string of length n is n/4.
Exercise 3.66
Let’s focus our attention on [zn], since we want to find the mean for the binomial distribution with size n. We know that [zn]qk(z)=(kn), so 0≤k≤n. Furthermore, [zn]P(z,1)=2n. We also have
[zn]rk(z)=0≤j≤k∑(jn).
Substituting all these into the formula for the cumulated cost gives
The double sum can be easily calculated by drawing a symmetric square matrix of size (n+1)×(n+1). The kth row should be filled with (kn). All columns sum to 2n and the main diagonal also sums to 2n. Our double sum is half the total sum of all entries. There is only one caveat; we must add once again the sum of entries at the main diagonal before dividing by 2.
Finally, the mean is simply the cumulated cost divided by 2n, which results in n/2.
Exercise 3.67
The book doesn’t provide details about derivation of the functional equation that must be satisfied by the BGF. This is important for understanding this and the next exercise. So, we’ll fill the gap here.
Derivation of the Functional Equation
We are given the recurrence for the PGF
QN(u)=N1k=1∑NuN+1Qk−1(u)QN−k(u),
that is defining the BGF
Q(z,u)=N≥0∑QN(u)zN.
Notice that Q(z,1) represents the sum of probabilities for each N, so it must designate a sequence of all ones, hence Q(z,1)=1/(1−z).
Let’s differentiate this BGF with respect to z and substitute the recurrence of QN(u) into it. We use the compact notation for partial derivatives, as introduced in the book.
Qz(z,u)=N≥1∑QN(u)NzN−1=N≥1∑(k=1∑NuN+1Qk−1(u)QN−k(u))zN−1=u2N≥1∑(k=1∑NQk−1(u)QN−k(u))(uz)N−1=u2N≥1∑(k=0∑N−1Qk(u)QN−1−k(u))(uz)N−1=u2N≥0∑(k=0∑NQk(u)QN−k(u))(uz)N=u2Q2(zu,u)(self-convolution of Q(zu, u)).
The correct initial condition is Q(0,u)=Q0(u)=1, which tells that sorting an empty array has zero cost with absolute certainty (probability is 1). The book shows the arguments in permuted order.
Solution of the Exercise
We know that QN(1)=1, since QN(u) is a PGF. This gives
Notice that QN′(1) satisfies the standard quicksort recurrence that the book addressed in section 3.3, so it must have the same solution. Consequently,
To verify the expression for q[2](z) we must differentiate the functional equation that Q(z,u) must satisfy twice with respect to u. To make the exposition manageable, due to convoluted terms, let’s introduce the following substitutions:
A(z)=Q(z,1) implying that A′(z)=A2(z).
x≡uz, y≡u and B(x,y)=Q(x,y) implying that Bu=zBx+By. This follows from applying the multivariable chain rule.
Let’s differentiate Qz=u2B2 once w.r.t. u to get
Qzu=2uB2+2u2BBu.
Repeating the previous step once more we arrive at
The following list shows the necessary coefficient extraction toolbox. The first identity is a left-shifted elementary OGF from Table 3.1. The last two items are derived by differentiating (1−z)−α once/twice w.r.t. α (see Exercise 3.28).
Now it’s a matter of getting coefficients for each summand and consolidating the result. Calculating the variance also requires some additional algebra. Recall that it is defined as
var(CN)=[zN](q[2](z)+q[1](z))−([zN]q[1](z))2.
Using some computer algebra tool we can confirm the exact expression for the variance given in Theorem 3.9.
Exercise 3.69
n,k≥0∑(kn)ukn!zn=n≥0∑(k≥0∑(kn)uk)n!zn=n≥0∑(1+u)nn!zn=n≥0∑n!((1+u)z)n=e(1+u)z(by Taylor expansion of ex).
Since g(z) is even, its series expansion contains only even powers of z. Therefore, Bk is zero for k odd, k≥3. This concludes the proof.
Exercise 3.75
This exercise is a simple generalization of the DGF for even numbers. Observe that a positive integer N ending in k≥0 zeros has the form N=2kM, where M≥1. This gives
N≥1ends in k 0s∑Nz1=N≥1∑(2kN)z1=2kz1ζ(z).
Exercise 3.76
The previous exercise already provides much of the answer, we just need to sum up DGFs for all k≥1, since absence of a kernel component entails a count of zero (odd number). This gives
# f_sq=f(z^2) is treated as a constant symbol.
var('z f_e f_o f f_sq')
eq1 = f == f_e + f_o
eq2 = f_o == z * (1 + f_e)
eq3 = f_e == z * f_o + f_sq
solutions = solve([eq1, eq2, eq3], f_e, f_o, f)
f_solution = solutions[0][2].rhs()
show(f_solution)
from math import ceil, comb
f = [0 for _ in range(101)]
f[1] = 1
for n in range(2, 101):
f[n] = sum(comb(k, 3 * k - n) * f[k] for k in range(ceil(n / 3), n // 2 + 1))
print(f[100])
import numpy as np
class ChangeDollar:
def __init__(self):
self.coins = [1, 5, 10, 25, 50]
self._period = 50
self._degree = 4
self._coeffs = self._precompute_coefficients()
def _solve(self, limit):
dp = [0] * (limit + 1)
dp[0] = 1
for coin in self.coins:
for i in range(coin, limit + 1):
dp[i] += dp[i - coin]
return dp
def _precompute_coefficients(self):
# To find 5 coefficients for a degree 4 poly, we need 5 points for each r.
limit = self._period * self._degree + (self._period - 1)
dp = self._solve(limit)
coeffs = {}
for r in range(self._period):
# Points for interpolation: N = r, r+50, r+100, r+150, r+200
x = np.array([r + self._period * i for i in range(self._degree + 1)], dtype=float)
y = np.array([dp[int(val)] for val in x], dtype=float)
# Solve the system of linear equations (Vandermonde matrix)
# a4*x^4 + a3*x^3 + a2*x^2 + a1*x + a0 = y
linear_system = np.vander(x, self._degree + 1)
coeffs[r] = np.linalg.solve(linear_system, y)
return coeffs
def solve(self, n):
if n < 0:
return 0
r = n % self._period
result = np.polyval(self._coeffs[r], n)
return int(round(result))
solver = ChangeDollar()
for n in (100, 200, 3731, 10**12):
print(f"Number of ways to change {n} cents: {solver.solve(n)}")
Number of ways to change 100 cents: 292
Number of ways to change 200 cents: 2435
Number of ways to change 3731 cents: 135730300
Number of ways to change 1000000000000 cents: 666666666793332747200572713557045308030976