This exercise illustrates the symbolic method and its efficacy in simplifying the analysis of combinatorial structures. A formal language is employed to define the combinatorial class, which corresponds to an associated generating function. Extracting coefficients then involves identifying the relevant values pertaining to the object under consideration.
Consider the class G of binary strings having no three consecutive 0 bits. Such strings are either ∊, a single 0, a double 00, or 1 or 01 or 001 followed by a string with no three consecutive 0 bits. Symbolically, this argument gives the combinatorial construction
G=∊+Z0+Z0×Z0+(Z1+Z0×Z1+Z0×Z0×Z1)×G.
Theorem 5.1 allows us to translate this immediately into a formula for the generating function G(z) that enumerates such strings:
G(z)=1+z+z2+(z+z2+z3)G(z)∴G(z)=1−z−z2−z31+z+z2.
The numerator is consisted of shift factors, as in the example from the book. The denominator is a well-known polynomial describing Tribonacci numbers (see Exercise 4.18). Therefore, the number of strings of length N with no three consecutive 0 bits is TN+TN+1+TN+2=TN+3, a sequence registered under A000073.
Exercise 5.2
We can express our class G symbolically as
G=SEQ(Z1)×SEQ(Z0).
This gives
G(z)=(1−z)21∴[zN]G(z)=N+1.
Exercise 5.3
This is a simple variation of the example from the book. The class U matches the construction
U=Z□+Z■×U×U.
We translate both Z classes as z to get
U(z)=z+zU(z)2∴U(z)=2z1−1−4z2.
It’s always wise to check our OGF. For example, WolframAlpha can expand the generating function with series \frac{1-\sqrt{1-4z^2}}{2z} to order 9.
Exercise 5.4
The total number of binary trees with N nodes is given by the Nth Catalan number. We need to figure out the number of binary trees with N nodes without superleaves. The ratio of these numbers specifies the required fraction. The recursive decomposition of our combinatorial structure is
T=Z□+Z■×T×T−Z■×Z■×Z■.
Notice that we must exclude a symmetric tree with 3 internal nodes that produces a superleaf at the root. This translates to
T■(z)=1+zT■(z)2−z3∴T■(z)=2z1−1−4z+4z4.
With the help of WolframAlpha we can confirm that the coefficients divided by the Catalan numbers produce the desired ratios. There is only one caveat, that trees with less than 3 nodes never produce a superleaf, so instead of zero their ratios should give one.
Exercise 5.5 🌟
This exercise introduces a new operation set of, where B is defined as the collection of all finite subsets of A.
We can regard B as a series of independent decisions. The combinatorial construction for a decision D about any single object of size n in A is
D=ϵ+Zn⟹D(z)=1+zn.
In other words, we either skip it or include it in our collection. Now, if we have An distinct objects of size n in A, then we make such decisions for every object. The construction is simply a sequence of An decisions that translates to (1+zn)An. Finally, the whole product comes from repeating this endeavor for every possible size n.
The second part of the proof follows from turning a product into a sum by taking a logarithm and employing Taylor expansion of ln(1+x). This gives
B(z)=n≥1∏(1+zn)An=exp{n≥1∑Anln(1+zn)}=exp{n≥1∑An(zn−z2n/2+z3n/3−⋯)}(see section 4.2 in the book)=exp{n≥1∑Anzn−21n≥1∑An(z2)n+31n≥1∑An(z3)n−⋯}=exp{A(z)−21A(z2)+31A(z3)−⋯}.
Exercise 5.6 🌟
This exercise introduces a new operation multiset of, where B is defined as the collection of all finite multisets of A (subsets with repetitions allowed).
We can regard B as a series of independent decisions. The combinatorial construction for a decision D about any single object of size n in A is
In other words, we either skip it, include it once, or twice, or any number of times (infinite geometric series). Now, if we have An distinct objects of size n in A, then we make such decisions for every object. The construction is simply a sequence of An decisions that translates to 1/(1−zn)An. Finally, the whole product comes from repeating this endeavor for every possible size n.
For the second part of the proof, we follow the same approach as in the previous exercise
B(z)=n≥1∏(1−zn)An1=exp{−n≥1∑Anln(1−zn)}(ln1=0)=exp{−n≥1∑An(−zn−z2n/2−z3n/3−⋯)}(see section 4.2 in the book)=exp{n≥1∑Anzn+21n≥1∑An(z2)n+31n≥1∑An(z3)n+⋯}=exp{A(z)+21A(z2)+31A(z3)+⋯}.
Exercise 5.7
This is a variation of the generalized derangements problem. The combinatorial construction is
PODD∗=SET(CYC1(Z)+CYC3(Z)+CYC5(Z)+⋯).
The EGF associated with the combinatorial class PODD∗ is
The second summand is an OGF for FN+2. The first summand can be rewritten as
(1−z−z2)21+z=z21+z⋅(1−z−z2z⋅1−z−z2z).
The OGF inside parenthesis is a self-convolution of Fibonacci numbers (there is a closed-form solution). Hence, after combining all components and simplifying, we get that the average number of 1 bits in a random bitstring of length N having no 00 is
FN+2[zN]Gu(z,1)=5FN+24NFN+2−(N+2)FN.
Exercise 5.14
The book contains the formula for the number of derangements DN. We can tweak the combinatorial construction, by marking cycles with u, to derive the cumulative generating function
Informally, we already know from the book that in a random permutation, the average number of cycles is HN. Let X be a random variable counting the number of fixed points (unary cycles) in a random permutation with N elements. We have X=∑kXk, where Xk is an indicator random variable telling whether the kth element remains on its current position. Apparently,
Pr{Xk=1}=1/N⟹E[X]=k∑Pr{Xk=1}=1.
Therefore, the average number of cycles in a random derangement for large N is ∼HN−1. Of course, this back-of-the-envelope calculation makes some unjustified assumptions about independence of events, but gives an indication about what happens at the limit.
Exercise 5.15 🌟
This exercise illustrates the symbolic method for parameters in great detail.
The total number of binary trees with N nodes is given by the Nth Catalan number. The recursive decomposition of our combinatorial structure is
T=ϵ+Z■+2Z■×(T−ϵ)+Z■×(T−ϵ)2.
In plain English, a tree is one of the following:
An empty tree (external node).
A leaf.
A root with one non-empty subtree (either left or right).
A root with two non-empty subtrees; we must mark this root.
Finding a partial derivative directly is quite complicated, even with a help of a computer algebra system. Instead, we will use the functional equation. By differentiating it implicitly, we get
Recall that T(z)=T(z,1)=1+zT(z,1)2, a standard equation for binary trees. This enables the last simplification. Extracting the coefficient is trivial by employing equation (5.72) from the book Concrete Mathematics by Graham, Knuth and Patashnik
[zN]Tu(z,1)=(N−32N−2).
The aforementioned book is an indispensable source when dealing with convoluted equations, as it contains numerous identities, like the one we have just used.
The average number of internal nodes in a binary tree of size N with both children internal is
[zN]T(z,1)[zN]Tu(z,1)=2(2N−1)(N−1)(N−2).
The formula checks perfectly for trees in Figure 5.2.
Exercise 5.16
This is a variation of the previous exercise. The only difference is that we must mark the internal node connected to a single non-empty subtree. This gives
The root with the smallest modulus is 1/ϕ, so β=ϕ. It is a simple algebra to show that
[zN]g(z)f(z)∼5ϕN+2.
The book incorrectly has N in the exponent instead of N+2.
Exercise 5.20
According to Exercise 2A(z)=(1−z)21. Here, β=1, ν=2, f(1)=1 and g′′(1)=2. Based on Theorem 4.1, an approximation for the number of bitstrings having no occurrence of 01 is
[zN]g(z)f(z)∼N.
Exercise 5.21 🌟
This exercise showcases the methodology of using the radius-of-convergence bounds.
Let f(z)=exp(1−zz). For the power series expansion of the exponent to converge, we must restrict our domain to ∣z∣<1 (the criterion for a geometric series to converge). Therefore,
[zn]f(z)≤x∈(0,1)minxnexp(1−xx).
We should set x to balance the growth of the numerator and the denominator as n grows. Since we must prove that the bound is a function of n, let’s try
x=1−n1.
This is a bit reminiscent of the trial and error process in finding particular solutions of nonhomogeneous differential equations.
We have
exp(1−xx)=exp(n11−n1)=exp(n−1),
and
xn1=(1−n1)−n=(1−n1)n(−n)≤exp(n+1).
See Exercise 4.39 to understand the last step. All in all, this concludes the proof
[zn]f(z)≤exp(n−1)⋅exp(n+1)=exp(2n).
We aren’t claiming that C=2 is optimal, just that there is such a constant.
Exercise 5.22
This is a variation of the previous exercise. Let f(z)=∏k=1∞(1−zk)−1. The radius of convergence is ∣z∣<1. To deal with the infinite product, we take the natural logarithm and use the Taylor series expansion ln(1−x)=−∑j=1∞jxj. This gives
lnf(x)=−k=1∑∞ln(1−xk)=k=1∑∞j=1∑∞jxjk=j=1∑∞j1k=1∑∞(xj)k=j=1∑∞j11−xjxj.(by absolute convergence of sums)
To bound this, we can use a standard algebraic identity (recall that 0<x<1)
Theorem 5.5 immediately gives the result with α=1/2
[zN]PODD∗(z)∼πN2=πN/21.
Exercise 5.24
For the special case f(z)≡1, the extension of Theorem 5.5 to three terms is available on Analytic Combinatorics by RS and PF book’s website (see slide 14). We will reuse this asymptotic series for deriving a more precise version of Theorem 5.5 below (when handling the term (1−z)−(α−k)).
The asymptotic expansion follows from expanding f(z) in a Taylor series about z=1
We have split the sum into two parts around the threshold value k0; in spirit, aligned with the Laplace method for sums. A good choice is k0=⌊n⌋; the head of the sum captures all of f(1), where the denominators tend to one.
Assuming that f(z) converges in a radius r>1, all terms indexed larger than k0 tend to zero exponentially fast (recall that [zn]f(z)=O(r−n)). Therefore, the tail sum is negligible.
Exercise 5.26
This is a variation of the previous exercise. From Table 3.1 we know that
[zn]1−z1ln1−z1=Hn.
We setup a convolution, just like in the proof of Theorem 5.5 from the book
[zn]f(z)(1−z1ln1−z1)=k=0∑nfkHn−k.
We split the sum in the same way as previously, so the leading part is
k=0∑k0fkHn−k∼Hnk=0∑k0fk∼f(1)lnn.
This is true, since Hn−k∼Hn∼lnn for large n and k≤k0=n. The tail sum is negligible for the same reason as in the previous exercise. Therefore,