Chapter 6

Exercise 6.1

The inductive hypothesis is the statement of the lemma. The base case is a tree consisting of a single external node ($i=0$, $e=1$), for which the lemma certainly holds.

For the inductive step, assume a tree has $i>0$ internal nodes. We can take the left and right subtrees of the root, both of them having fewer than $i$ internal nodes. Let their numbers be $i_L$ and $i_R$, respectively. For these subtrees, the inductive hypothesis holds, so $e_L=i_L+1$ and $e_R=i_R+1$. Furthermore, $e_L+e_R=i_L+1+i_R+1=i-1+2=i+1$ as expected, since $i_L+i_R=i-1$ (the root is an internal node) and $e_L+e_R=e$. This concludes the inductive proof.
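
The invariant $e=i+1$ is also easy to check exhaustively for small sizes. A minimal Python sketch (internal nodes encoded as pairs, `None` for external nodes; the helper names are ours):

```python
def trees(n):
    """Yield all binary trees with n internal nodes; None is an external node."""
    if n == 0:
        yield None
        return
    for k in range(n):  # k internal nodes go into the left subtree
        for left in trees(k):
            for right in trees(n - 1 - k):
                yield (left, right)

def externals(t):
    """Count external nodes of a binary tree."""
    return 1 if t is None else externals(t[0]) + externals(t[1])

# e = i + 1 for every binary tree with i internal nodes
for i in range(6):
    assert all(externals(t) == i + 1 for t in trees(i))
```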

Exercise 6.2

The total number of binary trees with $N$ internal nodes is given by the $N$th Catalan number $T_N$. We need to find the number of binary trees with $N$ internal nodes having both subtrees of the root nonempty. The ratio of these numbers gives the answer to this exercise. Thus,

$$\frac{T_N-2T_{N-1}}{T_N}=\frac{N-2}{2N-1} \quad \text{for } N>2.$$

For trees with fewer than 3 internal nodes this ratio is zero.
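
The ratio can be verified numerically (a quick sketch; `catalan` is our helper):

```python
from fractions import Fraction
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

# (T_N - 2 T_{N-1}) / T_N should equal (N - 2) / (2N - 1)
for N in range(2, 15):
    lhs = Fraction(catalan(N) - 2 * catalan(N - 1), catalan(N))
    assert lhs == Fraction(N - 2, 2 * N - 1)
```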

Exercise 6.3

The answer is $T_N^2/T_{2N+1}$, where $T_N$ is the $N$th Catalan number.

Exercise 6.4

Obviously, it is

$$\frac{G_{N-1}}{G_N}=\frac{T_{N-2}}{T_{N-1}}=\frac{N(N-1)}{(2N-2)(2N-3)} \qquad \text{for } N>1.$$

See the next exercise for a generalization of this problem.

Exercise 6.5 🌟


To answer this question, we switch to the symbolic method. We already know the symbolic construction for a general tree from the proof of Theorem 6.2. We want to find the trees where the root has exactly $t$ children. Symbolically, a tree in this class is a root node attached to a sequence of exactly $t$ members of $\mathcal{G}$:

$$\mathcal{A} = \mathcal{Z} \times \mathrm{SEQ}_{=t}(\mathcal{G}) = \mathcal{Z} \times \mathcal{G}^t \quad\therefore\quad A(z)=zG(z)^t=z^{t+1}F(z)^t.$$

We want the number of such trees of size $N$:

$$[z^N] A(z) = [z^{N-t-1}]F(z)^t=\frac{t}{2N-t-2}\binom{2N-t-2}{N-t-1}.$$

The RHS immediately follows from the Catalan $k$-fold convolution. The desired proportion is

$$\frac{\dfrac{t}{2N-t-2}\dbinom{2N-t-2}{N-t-1}}{G_N}= \frac{tN(N-1)(N-2)\cdots(N-t)}{(2N-2)(2N-3)\cdots(2N-t-2)} \qquad \text{for } N>t.$$

As a quick check, when $t=1$ we get exactly the formula from the previous exercise.

Exercise 6.6 🌟


To count the number of forests without singletons, we slightly alter the combinatorial construction for forests:

$$\mathcal{F}^+ = \mathrm{SEQ}(\underbrace{\mathcal{Z} \times (\mathcal{F}-E)}_{\mathcal{G}^+}).$$

The ordinary forest is a sequence of trees, but our restricted definition only allows trees with at least two nodes. Using the corresponding transfer theorems, we have

$$\begin{align*} F^+(z) &= \frac{1}{1-z(F(z)-1)} \\ &= \frac{1}{1-z^2F(z)^2} \\ &= \frac{1}{1-G(z)^2} \\ &= \frac{1}{(1-G(z))(1+G(z))} \\ &= \frac{F(z)}{1+G(z)} \\ &= \frac{F(z)}{1+\frac{F(z)-1}{F(z)}} \qquad \text{($G(z)=zF(z)=zF^2(z)/F(z)$)} \\ &= \frac{F(z)^2}{2F(z) - 1} \\ &= \frac{F(z)^2 \cdot (z + 2)}{(2F(z) - 1) \cdot (z + 2)} \\ &= \frac{zF(z)^2 + 2F(z)^2}{(2F(z) - 1)(z + 2)} \\ &= \frac{F(z) - 1 + 2F(z)^2}{(2F(z) - 1)(z + 2)} \\ &= \frac{(2F(z) - 1)(F(z) + 1)}{(2F(z) - 1)(z + 2)} \\ &= \frac{F(z) + 1}{z + 2} \\ &\therefore \boxed{(z+2)F^+(z) = 1 + F(z)}. \end{align*}$$

If we extract the coefficient of $z^N$ for $N \ge 1$, we get a recurrence relation

$$F^+_{N-1} + 2F^+_N = F_N \quad \text{with } F^+_0=1.$$

We can solve this first-order linear recurrence using the generalized technique presented in Exercise 2.14. This gives our proportion (after dividing by $F_N=T_N$)

$$P_N = \frac{1}{T_N} \left[ \left(-\frac{1}{2}\right)^N + \sum_{k=1}^N \frac{(-1)^{N-k}}{2^{N-k+1}} T_k\right].$$

Computing the first few values $F^+_1,\dots,F^+_4$ gives 0, 1, 2, 6. Notice that $6/14=3/7$, as expected.
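
Unrolling the recurrence confirms these values (a Python sketch):

```python
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

fp = [1]  # F^+_0 = 1
for N in range(1, 5):
    # F^+_{N-1} + 2 F^+_N = F_N, with F_N = T_N
    fp.append((catalan(N) - fp[N - 1]) // 2)

print(fp[1:])  # -> [0, 1, 2, 6]
```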

Exercise 6.7

We first group forests by size (the position in the list below designates $N$) and then enumerate each group in top-down, left-to-right fashion. This gives

  1. ()

  2. ()(), (())

  3. ()()(), (())(), ()(()), (()()), ((()))

  4. ()()()(), (())()(), ()(())(), ()()(()), ((()))(), (()())(), (())(()), ()(()()), ()((())), (()()()), ((())()), ((()())), (()(())), (((())))
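
Such a listing can be generated mechanically. A small sketch that enumerates forests as balanced parenthesis strings (the enumeration order may differ from the listing above):

```python
def forests(n):
    """All forests with n nodes, encoded as balanced parenthesis strings."""
    if n == 0:
        return [""]
    out = []
    for k in range(n):  # k nodes go inside the first tree
        for inner in forests(k):
            for rest in forests(n - 1 - k):
                out.append("(" + inner + ")" + rest)
    return out

counts = [len(forests(n)) for n in range(5)]
print(counts)  # -> [1, 1, 2, 5, 14]
```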

Exercise 6.8

A notable distinction between ordered trees and binary trees is that a binary tree may be empty. Accordingly, an explicit notation should be used to indicate an empty binary tree. The diagram below provides a reference to highlight the differences in their respective representations.

Illustration of the rotation correspondence between an ordered and binary tree. External nodes are omitted on the right.

Preorder traversal of a tree maps directly to a preorder traversal of a binary tree. Postorder traversal of a tree corresponds to an inorder (not postorder) traversal of a binary tree. There is no notion of an inorder traversal of a tree, since it’s not defined when to visit the root among multiple children.

Parenthesis Systems Representation

The preorder representation of a tree is

(1 (2 (5) (6) ) (3) (4) )

The preorder representation of a binary tree is

(1 (2 (5) (6) ) (3) (4) )

circle-info

In an arbitrary binary tree, it is necessary to append an additional closing parenthesis at the end to indicate the last external node. Since a binary tree is a positional structure, each external node (empty link) is denoted by a closing parenthesis. Consider node 5, which has node 6 as its right child; during traversal of 5's children, a closing parenthesis is placed for the left child and an opening parenthesis for the right child. However, the root of a binary tree produced through rotation correspondence from an ordered tree doesn’t possess a right child in this context.

The postorder representation of a tree is

( ( (5) (6) 2) (3) (4) 1)

The inorder representation of a binary tree is

( ( (5) (6) 2) (3) (4) 1)

The postorder representation of a binary tree is a bit awkward to express in this manner. As highlighted in the book, when we know additional properties of nodes (like the number of arguments of an operator), we can avoid parentheses altogether in a preorder/postorder representation of any type of tree. An inorder representation must encode structural information of a binary tree, since ambiguities may arise otherwise.

Preorder/Postorder Degree Representation

A tree can also be represented by listing the degree (number of children) of each node in a specific traversal order. In a binary tree, standard "degrees" don't apply the same way. Instead, the degree sequence of a tree dictates the shape of the right-leaning spines in the matching binary tree. More specifically,

  • If a node $X$ in a tree has a degree of $k$, it means $X$ has $k$ children.

  • In a binary tree, this translates to: $X$'s left child exists, and from that left child, there is a continuous chain of right children of length $k - 1$. For example, node 1 of the tree in our diagram has 3 children, thus node 2 of the binary tree has a right-leaning spine of length 2.

  • A degree of 0 in a tree means the corresponding node in a binary tree has no left child.
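
The traversal correspondences above can be confirmed programmatically. A sketch (ordered trees as `(label, [subtrees])`, binary trees as `(label, left, right)` with `None` for omitted links; all names are ours) using the tree from the diagram:

```python
def to_binary(tree):
    """Rotation correspondence: first child -> left link, next sibling -> right link."""
    def convert(siblings):
        if not siblings:
            return None
        label, kids = siblings[0]
        return (label, convert(kids), convert(siblings[1:]))
    return convert([tree])

def pre_tree(t):
    label, kids = t
    return [label] + [x for k in kids for x in pre_tree(k)]

def post_tree(t):
    label, kids = t
    return [x for k in kids for x in post_tree(k)] + [label]

def pre_bin(b):
    return [] if b is None else [b[0]] + pre_bin(b[1]) + pre_bin(b[2])

def in_bin(b):
    return [] if b is None else in_bin(b[1]) + [b[0]] + in_bin(b[2])

# The tree from the diagram: 1 with children 2 (which has children 5, 6), 3, 4.
t = (1, [(2, [(5, []), (6, [])]), (3, []), (4, [])])
b = to_binary(t)
assert pre_tree(t) == pre_bin(b)    # preorder maps to preorder
assert post_tree(t) == in_bin(b)    # postorder maps to inorder
```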

Exercise 6.9

(Solution not yet available.)

Exercise 6.10 🌟


Let’s rotate any lattice path from Figure 6.6 by $45°$ counterclockwise to get a graph of the winner’s margin as the ballots are counted. This perspective is more natural for understanding the dynamics behind the scenes. We can represent ups and downs as follows:

  • Reading a 0 is a step "Up-Right" $(+1, +1)$.

  • Reading a 1 is a step "Down-Right" $(+1, -1)$.

Every path starts at $A(0,0)$ and lands at $T(N,2k-N)$, assuming $k$ zeros in the path. To be good, a path must never drop below the $x$ axis, i.e., never touch the line $y=-1$. Consequently, a necessary condition is $2k-N \ge 0 \implies k \ge \lceil N/2 \rceil$. In probability, it’s sometimes easier to count the complementary events. In our case, the number of good paths equals the total number of paths minus the number of bad ones. The former is easy to express as $\binom{N}{k}$. To count the bad paths, we apply the previously mentioned reflection method.

If a path is bad, it hits the line $y = -1$; let $P(M,-1)$ be the first such touch point. If we take the portion of the path before $P$ and reflect it across the line $y = -1$, the reflected portion climbs up from $A'(0,-2)$ to $P$ and does the opposite of what the original segment from $A(0,0)$ to $P$ does. The key insight of the reflection method is that this establishes a bijection between bad paths from $A(0,0)$ to $T$ via $P$ and paths from $A'(0,-2)$ to $T$ via $P$: for every possible way for a path to hit $P$ from $A$, there is a unique path doing the same from $A'$. Notice also that the segments after $P$ coincide between the two paths. Therefore, the number of bad paths from $A$ equals the number of all paths from $A'$. The latter is now easy to express as $\binom{N}{k+1}$.

To find the total number of valid strings of length $N$, we just add up these valid paths for all allowable values of $k$:

$$\sum_{k = \lceil N/2 \rceil}^{N} \left[ \binom{N}{k} - \binom{N}{k+1} \right]=\binom{N}{\lceil N/2 \rceil}=\binom{N}{\lfloor N/2 \rfloor}.$$

The sum nicely telescopes and only the first term "survives" the cancellations. Remember that $\binom{N}{N+1}=0$, hence the last term doesn’t matter.
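
A brute-force cross-check of the closed form over all bit strings (a sketch):

```python
from itertools import product
from math import comb

def is_good(s):
    """A ballot sequence is good if 0s (votes for the winner) never trail 1s."""
    margin = 0
    for c in s:
        margin += 1 if c == 0 else -1
        if margin < 0:
            return False
    return True

for N in range(1, 13):
    good = sum(is_good(s) for s in product((0, 1), repeat=N))
    assert good == comb(N, N // 2)
```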

Exercise 6.11

There is a bijection between the parenthesis representation of an ordered forest and the plus–minus representation of its associated binary tree: simply map $($ to $+$ and $)$ to $-$. The only caveat is that there is an extra $-$ at the end of the representation of a binary tree, encoding the last external node. This difference should be taken into account when converting from one form into the other.

Exercise 6.12

This topic is known as treemapping, and many algorithms exist that produce outputs of various quality. Aspect ratio plays a central role, and a particularly popular variant is the squarified treemap. It attains $\alpha$ close to one.

As a side note, leveraging the ISO paper standard’s constant width-to-height aspect ratio of $\alpha=\sqrt 2$, we can devise a simple algorithm to represent binary trees with nested rectangles. This value of $\alpha$ ensures that any resulting smaller sub-rectangles retain the exact same aspect ratio. The algorithm traverses the binary tree in preorder and, for every internal node, splits the corresponding nested rectangle along its wider side into two; the split lines thus correspond to internal nodes. An extension to store ordered trees would be to first convert them into binary trees using the rotation correspondence.

Exercise 6.13

The paper Rotational tree structures on binary trees and triangulations explains in detail the rotational relationships among the $N$ trees corresponding to the $N$ rotations of an asymmetric triangulation. An advantage is that it employs the same hexagons as illustrated in Figure 6.8 of the book.

Exercise 6.14

(Solution not yet available.)

Exercise 6.15

Let $t_i$ denote the $i$th child of $t$, if any. Recall that a general tree cannot be empty. The recursive definitions are as follows:

$$pl(t)=|t|-1+\sum_i pl(t_i), \qquad h(t)=[\,|t|>1\,]+\max_i h(t_i).$$

Notice the usage of the Iverson bracket in the definition of height.
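
The definitions translate directly into code. A sketch with a general tree encoded as a (possibly empty) list of subtrees:

```python
def size(t):
    """Number of nodes |t|; a general tree is a (possibly empty) list of subtrees."""
    return 1 + sum(size(c) for c in t)

def pl(t):
    """pl(t) = |t| - 1 + sum_i pl(t_i)."""
    return size(t) - 1 + sum(pl(c) for c in t)

def h(t):
    """h(t) = [|t| > 1] + max_i h(t_i)."""
    return 1 + max(h(c) for c in t) if t else 0

chain = [[[]]]       # a chain of three nodes: levels 0, 1, 2
star = [[], [], []]  # a root with three leaf children
assert (size(chain), pl(chain), h(chain)) == (3, 3, 2)
assert (size(star), pl(star), h(star)) == (4, 3, 1)
```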

Exercise 6.16

Let $l(t)$ denote the number of leaves in $t$. For a binary tree, if $t$ is an external node, then $l(t)=0$. The number of leaves in binary and general trees, respectively, are as follows:

$$l(t)=[\,|t|=1\,]+l(t_l)+l(t_r), \qquad l(t)=[\,|t|=1\,]+\sum_i l(t_i).$$

See the previous exercise for notational conventions.
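
A direct transcription (a sketch; the singleton test $[\,|t|=1\,]$ is implemented via the children directly, which is equivalent):

```python
def leaves_bin(t):
    """l(t) for a binary tree; None is an external node, internal nodes are (left, right) pairs."""
    if t is None:
        return 0
    left, right = t
    is_leaf = left is None and right is None  # equivalent to [|t| = 1]
    return int(is_leaf) + leaves_bin(left) + leaves_bin(right)

def leaves_gen(t):
    """l(t) for a general tree given as a list of subtrees."""
    return int(not t) + sum(leaves_gen(c) for c in t)

assert leaves_bin(((None, None), (None, None))) == 2
assert leaves_gen([[], [[]], []]) == 3
```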

Exercise 6.17

We prove this by induction on $N$ (the number of external nodes) using the slightly stronger hypothesis $h(t) \ge \lceil\lg N\rceil \ge \lg N$, which is legitimate because the height is an integer. The base case ($N=1$) is trivially satisfied, since $h(t)=0 \ge \lg 1=0$.

For the inductive step $N>1$, we have

$$\begin{align*} h(t) &= 1+\max(h(t_l), h(t_r)) \\ &\ge 1+\lceil\lg \lceil N/2\rceil \rceil && \text{(by the inductive hypothesis)} \\ &\ge 1+\lceil\lg {N/2} \rceil \\ &=1+\lceil\lg N\rceil-1 \\ &=\lceil\lg N\rceil. \end{align*}$$

The derivation relies on the fact that $N=N_l+N_r$, since the root is an internal node. Furthermore, one of the subtrees must have at least $\lceil N/2 \rceil$ external nodes, while the other must have at least one. Therefore, we can apply the inductive hypothesis to the larger subtree.

Exercise 6.18

The Kraft–McMillan inequality is a generalization of this exercise. The referenced Wikipedia article provides a comprehensive proof, referring to leaves rather than external nodes. The stated equality applies to binary trees (defined, as in the book, as full binary trees) within the context of complete codes. See also Exercise 6.37 for more in-depth details.

Exercise 6.19

The bounds are $N-1\le pl(t) \le N(N-1)/2$. The lower bound is achieved by a star tree (root with $N-1$ children), whilst the upper bound applies for a chain of nodes.

Exercise 6.20

Assume $N>0$. By the Lemma from the book, we have $xpl(t) = ipl(t) + 2N$. Therefore, bounds on the internal path length determine those on the external. From the previous exercise, we already know that

$$ipl(t) \le N(N-1)/2 \implies xpl(t)\le N(N+3)/2.$$

The lower bound is attained by the most balanced tree (a complete binary tree), where height is minimized. Consequently, $h(t)=\lceil \lg (N+1) \rceil$ with $1 \le r \le 2^{h(t)-1}$ nodes at the last level. This gives

$$\begin{align*} ipl(t) &= \sum_{k=1}^{h(t)-2}k2^k+(h(t)-1)r \\ &= (h(t)-3)2^{h(t)-1}+2+(h(t)-1)r \\ &= (h(t)-3)2^{h(t)-1}+2+(h(t)-1)(N-2^{h(t)-1}+1) \\ &= (h(t)-1)N - 2^{h(t)} + h(t) + 1 \\ &= N\lg N+O(N). \end{align*}$$

Therefore, $xpl(t)=N \lg N + O(N)$.

Exercise 6.21

Assume that the number of internal nodes is $N>0$. The number of leaves satisfies

$$1 \le l(t) \le \left\lceil \frac{N}{2} \right\rceil.$$

The lower bound happens for the most unbalanced arrangement (a chain of nodes). The upper bound is attained by the most balanced tree (a complete binary tree). In a perfect binary tree (where all leaves are at the last level), their number is exactly $(N+1)/2=\lceil N/2 \rceil$. Any other arrangement results in a lower number, since we must remove two leaves at the last level to gain a single one on the level above.

Exercise 6.22

We can accomplish this with 2 registers as follows:

Observe that this is an optimal number, since we cannot do it with only a single register.

Exercise 6.23

The traversals are listed below from left to right:

  • preorder: *+abd inorder: (a+b)*d postorder: ab+d*

  • preorder: -**+ab-de+fg*hi inorder: (((a+b)*(d-e))*(f+g))-(h*i) postorder: ab+de-*fg+*hi*-

Exercise 6.24

The tree denoting the expression $((a^2 + b + c) \star (d^4 - e^2) \star (f + g + h)) - i \star j$.

The unadorned traversals of the tree are:

  • preorder: -*+*aabc-*dddd*ee+fgh*ij

  • postorder: aa*bc+dddd*ee*-fgh+*ij*-

These only work unambiguously if every operator has a fixed and known arity. For example, *aabc could be interpreted as $a^2$ or $a^2*b$. Arity indicators may resolve these ambiguities:

  • preorder: -2*3+3*2aabc-2*4dddd*2ee+3fgh*2ij

  • postorder: aa*2bc+3dddd*4ee*2-2fgh+3*3ij*2-2

Binary tree representation of the tree by rotation correspondence. A blue (left) edge designates the first child, whilst the red (right) edge represents the next sibling in the original tree. External nodes are omitted.

The unadorned traversals of the binary tree are:

  • preorder (same as the preorder for the tree): -*+*aabc-*dddd*ee+fgh*ij

  • inorder (same as the postorder for the tree): aa*bc+dddd*ee*-fgh+*ij*-

  • postorder: aacb*ddddee**hgf+-+ji**-

See Exercise 6.8 for details about how degrees (arities) can be deciphered based on the lengths of right-leaning chains. Observe that the postorder traversal here produces an algebraic mess.

Exercise 6.25

The sequence A138156 enumerates the sum of internal path lengths of all binary trees with $N$ edges ($N-1$ internal nodes). So, $Q_5=352/42 \approx 8.381$.

The sequence A288964 enumerates the number of key comparisons needed to sort all $N!$ permutations of $N$ elements by quicksort. So, $C_5=888/120=7.4$.

Explanation for the Correspondence with Quicksort

The total number of key comparisons made by quicksort (using the first element as pivot) when applied to all $N!$ permutations of $N$ distinct elements equals the sum of the internal path lengths of the binary search trees obtained by inserting the elements in the same order. This implicit connection is mentioned in the book (see Section 6.8 and Exercise 6.34) as well as in the book's course material (see slide 43).

Consider a fixed permutation. In quicksort, each recursive call picks a pivot (let it be the first element of the current subarray, to make the connection with BSTs clearer) and compares it with every other element in that subarray. Thus, if a pivot corresponds to a node $v$ whose subtree in the recursion tree (or in the corresponding BST) contains $|t_v|$ elements, then the number of comparisons made at that node is $|t_v|-1$. Summing over all nodes gives the total comparisons for that permutation

$$c = \sum_v (|t_v|-1) = \left(\sum_v |t_v|\right) - n.$$

Now, in the BST built by inserting the elements in the same order, the root is the first element, and the left and right subtrees are built from the elements that are smaller and larger, respectively, in the order they appear. This exactly mirrors the recursion of quicksort. For any tree, the sum of subtree sizes (including each node itself) is related to the internal path length by

$$\sum_v |t_v| = ipl(t) + n,$$

because each node contributes 1 to its own subtree and to the subtree of every ancestor. Combining the two equations yields $c = (ipl(t) + n) - n = ipl(t)$. Thus, the number of comparisons for a given permutation equals the internal path length of its associated BST. Summing over all permutations gives the desired equality.
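
The argument can be validated exhaustively for $N=5$ (a sketch; the helper names are ours):

```python
from itertools import permutations

def comps(a):
    """Quicksort comparisons, first element as pivot."""
    if len(a) <= 1:
        return 0
    pivot, rest = a[0], a[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x > pivot]
    return len(a) - 1 + comps(left) + comps(right)

def ipl(perm):
    """Internal path length of the BST built by inserting perm in order."""
    root, total = None, 0
    for x in perm:
        if root is None:
            root = [x, None, None]
            continue
        node, depth = root, 0
        while True:
            depth += 1
            i = 1 if x < node[0] else 2
            if node[i] is None:
                node[i] = [x, None, None]
                break
            node = node[i]
        total += depth
    return total

results = [(comps(list(p)), ipl(p)) for p in permutations(range(5))]
assert all(c == i for c, i in results)       # c = ipl(t) for every permutation
assert sum(c for c, _ in results) == 888     # matches A288964; average 888/120 = 7.4
```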

Exercise 6.26

A degenerate tree structure is a chain. Observe that a preorder traversal of such a tree reproduces the original sequence of insertions, because each insertion simply adds the new element at the end of the chain.

Let $a_i \neq a_j$ be two elements whose relative order differs in two different permutations that produce a chain. There must be such a pair. WLOG let $i<j$, otherwise swap indices. The two BSTs cannot be structurally the same, since the directionality of branching at $a_i$ must change. Recall that $i<j$ entails that $a_i$ was inserted before $a_j$.

The height of a degenerate tree is $h=N$. The total number of degenerate trees is $2^{h-1}=2^{N-1}$, since we have two branching choices per edge and the number of edges is $N-1$. Therefore, the probability that a degenerate tree structure will result is $2^{N-1}/N!$.
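
A brute-force confirmation of the $2^{N-1}$ count (a sketch):

```python
from itertools import permutations

def is_degenerate(perm):
    """Build the BST and check that every node has at most one child."""
    root = None
    for x in perm:
        if root is None:
            root = [x, None, None]
            continue
        node = root
        while True:
            i = 1 if x < node[0] else 2
            if node[i] is None:
                node[i] = [x, None, None]
                break
            node = node[i]
    stack = [root]
    while stack:
        n = stack.pop()
        kids = [c for c in (n[1], n[2]) if c is not None]
        if len(kids) == 2:
            return False
        stack.extend(kids)
    return True

for N in range(1, 7):
    assert sum(is_degenerate(p) for p in permutations(range(N))) == 2 ** (N - 1)
```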

Exercise 6.27

The sequence A076615 enumerates the number of permutations of $\{1,2,\dots,N\}$ that result in a binary search tree with the minimum possible height. So, we just need to read out the corresponding entry and divide it by $N!$.

To derive the number $T(n)$ of insertion sequences that yield a perfectly balanced binary search tree of height $n$ (with $N=2^n-1$ nodes), we use a recursive combinatorial argument based on the tree structure.

Consider a perfectly balanced tree of the given height:

  • The root is the median element (unique because keys are distinct).

  • The left subtree is itself a perfectly balanced tree of height $n-1$ with $2^{n-1} - 1$ nodes.

  • The right subtree is identical in size and shape.

For any insertion sequence to produce this tree, the root must come first. The remaining $N-1 = 2^n - 2$ nodes consist of all nodes from the left and right subtrees. The order in which they’re inserted must respect the internal constraints of each subtree: the nodes of the left subtree must appear in an order that yields the left subtree shape, and similarly for the right. However, the two subsequences can be interleaved arbitrarily.

Let $M=2^{n-1} - 1$ be the size of each subtree, so:

  • The number of valid sequences for the left subtree is $T(n-1)$.

  • The number for the right subtree is also $T(n-1)$.

  • Once we have fixed the internal orders for each subtree, we need to interleave them into a single sequence of length $2M$ (the positions after the root). The number of ways to interleave two sequences of lengths $M$ and $M$ while preserving each internal order is $\binom{2M}{M}$. This counts the choices of which positions in the combined sequence are occupied by left-subtree nodes; the left-subtree sequence then fills those positions in order, and the right-subtree sequence fills the rest.

This gives

$$T(n) = \binom{2^n - 2}{2^{n-1} - 1} \cdot T^2(n-1)=\prod_{k=1}^{n-1} \left( \frac{(2^{\,n-k+1} - 2)!}{\bigl((2^{\,n-k} - 1)!\bigr)^2} \right)^{\!2^{k-1}},$$

with $T(1)=1$. The RHS follows by iterating the recurrence. The probability of getting a perfectly balanced tree is $T(n)/N!$.
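
The recurrence is trivial to evaluate (a sketch):

```python
from math import comb, factorial

T = {1: 1}
for n in range(2, 6):
    M = 2 ** (n - 1) - 1                      # size of each subtree
    T[n] = comb(2 * M, M) * T[n - 1] ** 2     # interleavings times subtree orders

assert T[2] == 2 and T[3] == 80
probability = T[3] / factorial(2 ** 3 - 1)    # chance of a perfect BST for N = 7
```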


As a quick check, we get that $T(3)=80$, which matches the previously mentioned sequence for $N=7$ (it’s zero-indexed).

Exercise 6.28

Preorder traversal naturally preserves the tree structure, since roots of subtrees are visited first. This ensures that the relative order between ancestors and their descendants is properly maintained, since parents are already in place before their children are inserted. By induction, the tree is reconstructed correctly.

Consider the tree with root 2 and left child 1: its postorder traversal is 1 2. But this sequence would produce a right-leaning tree, which is different from the original. Thus, postorder traversal doesn’t preserve the tree structure.

The level order traversal also preserves the structure, since nodes are visited in a manner that retains the relative order between ancestors (nodes visited/inserted earlier) and their descendants. Again, parents are already in place before their children are inserted. By induction, the tree is reconstructed correctly.
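
Both positive claims are easy to check exhaustively (a sketch; trees as nested lists `[key, left, right]`):

```python
from itertools import permutations

def build(seq):
    """BST built by inserting the elements of seq in order."""
    root = None
    for x in seq:
        if root is None:
            root = [x, None, None]
            continue
        node = root
        while True:
            i = 1 if x < node[0] else 2
            if node[i] is None:
                node[i] = [x, None, None]
                break
            node = node[i]
    return root

def preorder(t):
    return [] if t is None else [t[0]] + preorder(t[1]) + preorder(t[2])

def levelorder(t):
    out, queue = [], [t]
    while queue:
        n = queue.pop(0)
        if n is not None:
            out.append(n[0])
            queue += [n[1], n[2]]
    return out

for p in permutations(range(5)):
    t = build(p)
    assert build(preorder(t)) == t     # preorder reconstructs the tree
    assert build(levelorder(t)) == t   # so does level order
```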

Exercise 6.29

$$\begin{align*} C_T(z) &= \sum_{t_l \in \mathcal{T}}\sum_{t_r \in \mathcal{T}}(\text{ipl}(t_l)+\text{ipl}(t_r)+|t_l|+|t_r|)z^{|t_l|+|t_r|+1} \\ &= z \left[\sum_{t_l \in \mathcal{T}}\sum_{t_r \in \mathcal{T}}(\text{ipl}(t_l)+\text{ipl}(t_r)+|t_l|+|t_r|)z^{|t_l|+|t_r|} \right] \\ &= z \left[\sum_{t_l \in \mathcal{T}}z^{|t_l|} \sum_{t_r \in \mathcal{T}}(\text{ipl}(t_l)+\text{ipl}(t_r)+|t_l|+|t_r|)z^{|t_r|} \right] \\ &= z \left[\sum_{t_l \in \mathcal{T}}z^{|t_l|} \sum_{t_r \in \mathcal{T}}(\text{ipl}(t_l)z^{|t_r|}+\text{ipl}(t_r)z^{|t_r|}+|t_l|z^{|t_r|}+|t_r|z^{|t_r|}) \right] \\ &= z \left[\sum_{t_l \in \mathcal{T}}z^{|t_l|} \left(\text{ipl}(t_l)T(z)+C_T(z)+|t_l|T(z)+zT'(z) \right)\right] \\ &= z \left[C_T(z)T(z)+C_T(z)T(z)+zT(z)T'(z)+zT(z)T'(z)\right] \\ &= 2zC_T(z)T(z)+2z^2T(z)T'(z). \end{align*}$$

Exercise 6.30

$$\begin{align*} C_G(z) &= \sum_{t \in \mathcal{G}}\text{pl}(t)z^{|t|} \\ &= \sum_{k \ge 0}\sum_{t_1 \in \mathcal{G}}\sum_{t_2 \in \mathcal{G}} \dots \sum_{t_k \in \mathcal{G}} (\text{pl}(t_1) +\text{pl}(t_2)+ \dots + \text{pl}(t_k)+|t_1|+|t_2|+ \dots +|t_k|)z^{|t_1|+|t_2|+ \dots +|t_k|+1} \\ &= \sum_{k \ge 0} z \left[\sum_{t_1 \in \mathcal{G}} z^{|t_1|}\sum_{t_2 \in \mathcal{G}}z^{|t_2|} \dots \sum_{t_k \in \mathcal{G}} (\text{pl}(t_1) +\text{pl}(t_2)+ \dots + \text{pl}(t_k)+|t_1|+|t_2|+ \dots +|t_k|)z^{|t_k|} \right]\\ &= \sum_{k \ge 0} z \left[\sum_{t_1 \in \mathcal{G}} z^{|t_1|}\sum_{t_2 \in \mathcal{G}}z^{|t_2|} \dots \bigg(\text{pl}(t_1)G(z)+\text{pl}(t_2)G(z)+ \dots + C_G(z)+|t_1|G(z)+|t_2|G(z)+ \dots +zG'(z)\bigg) \right] \\ &\dots \\ &= \sum_{k \ge 0} z \left[kC_G(z)G(z)^{k-1}+kzG'(z)G(z)^{k-1} \right] \\ &= \frac{zC_G(z)+z^2G'(z)}{(1-G(z))^2} \qquad \text{$\left(\sum_k kx^{k-1}=1/(1-x)^2\right)$}. \end{align*}$$

Exercise 6.31

We can derive the relationship by monitoring how node levels translate between the two tree structures and using the inherent symmetry of binary trees.

A general tree $G$ of $N$ nodes maps to a binary tree $T$ of $N$ nodes where the root only has a left subtree of size $N-1$. A key insight is that the level of a node $v$ in $G$, denoted $\text{level}_G(v)$, matches the number of left branches on the path from the root to $v$ in $T$, denoted $\text{left-branches}_T(v)$. This follows from the first-child, next-sibling rule of the rotation correspondence. Thus, we can express the path length of $G$ in terms of left branches in $T$:

$$\text{pl}(G)=\sum_{v \in G} \text{level}_G(v) = \sum_{v \in T} \text{left-branches}_T(v).$$

The number of left branches of any node in $T$ is one more than this number taken within the root's left subtree $t_l$. This gives

$$\text{pl}(G)= \sum_{v \in t_l} (1+\text{left-branches}_{t_l}(v))=N-1+ \sum_{v \in t_l} \text{left-branches}_{t_l}(v).$$

Now we look at the average over all randomly generated trees. Let $\mathbb{E}[\text{pl}(G_N)]$ be the expected path length of a random general tree of size $N$, and $\mathbb{E}[\text{ipl}(T_N)]$ the expected internal path length of a random binary tree of size $N$. On average, we expect that the total number of left branches across all paths in $T$ equals the total number of right branches. Since the internal path length of a binary tree is the sum of left and right branches, as specified above, we have

$$\mathbb{E}\left[\sum_{v \in t_l} \text{left-branches}_{t_l}(v)\right] = \frac{1}{2} \mathbb{E}[\text{ipl}(T_{N-1})].$$

This gives

$$\mathbb{E}[\text{pl}(G_N)] =N-1+\frac{1}{2} \mathbb{E}[\text{ipl}(T_{N-1})].$$

This result matches the formula from the book, once we substitute the expression from Theorem 6.3 into the RHS.

Exercise 6.32

In Program 1.2, the pivot selected for each subarray is its final element. Accordingly, it is necessary to ensure that the last element of each subarray becomes the root of the subsequent subtree. It should be noted that during each iteration, the pivot is exchanged with the border element, which must be considered in the overall process. Possible permutations corresponding to Figure 6.12, listed from left to right, are as follows:

  1. AA,AB,CN,MS,EF,MC,JB,PD,AL

  2. PD,MS,MC,JB,EF,CN,AL,AB,AA

  3. AA,AL,AB,EF,CN,MS,MC,PD,JB

Exercise 6.33 🌟


The probability that the successful search cost is 2 equals the probability that the key matches the root of the left or right subtree of the tree's root. Due to symmetry, these two probabilities are the same. Observe that subtrees of random BSTs are themselves random BSTs, so the randomness property is fully preserved (as in quicksort). Let $S_k$ be a subtree of size $k$. The joint probability can be expressed as

$$\begin{align*} \Pr\{\text{key = root of } S_k\} &= \Pr\{(|S|=k) \cap (\text{key} \in S) \cap (\text{key = root of } S)\} \\ &= \frac{1}{N} \cdot\frac{k}{N} \cdot \frac{1}{k} \\ &= \frac{1}{N^2}. \end{align*}$$

We ignore $k=0$, since only successful searches are considered. The final probability is computed by summing over $0<k<N$ and multiplying by two:

$$q_N(2)= \frac{2(N-1)}{N^2}.$$
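
An exhaustive check of $q_N(2)$ for small $N$ (a sketch):

```python
from fractions import Fraction
from itertools import permutations

def depth_counts(perm):
    """Map each key to its depth in the BST built from perm."""
    root, depths = None, {}
    for x in perm:
        if root is None:
            root = [x, None, None]
            depths[x] = 0
            continue
        node, d = root, 0
        while True:
            d += 1
            i = 1 if x < node[0] else 2
            if node[i] is None:
                node[i] = [x, None, None]
                break
            node = node[i]
        depths[x] = d
    return depths

for N in range(2, 7):
    perms = list(permutations(range(N)))
    hits = sum(sum(1 for d in depth_counts(p).values() if d == 1) for p in perms)
    # random permutation, random key: P(search cost = 2) = 2(N-1)/N^2
    assert Fraction(hits, N * len(perms)) == Fraction(2 * (N - 1), N * N)
```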

An important aspect of the preceding analysis merits attention. It’s established that binary search trees constructed from random permutations generally exhibit balanced structures. The reason top subtree sizes adhere to a uniform probability distribution is that the root node may assume any value within the interval $[1, N]$ with equal likelihood, thereby determining the respective sizes of its left and right subtrees.

Exercise 6.34

The following Python script creates the requested plot.
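
The original script is not reproduced here; a sketch of what it may look like (the parameters $N=1000$ and 200 trials are our choice; matplotlib, if installed, renders the actual histogram):

```python
import random

def construction_cost(perm):
    """Comparisons needed to build a BST from perm (equals its internal path length)."""
    root, total = None, 0
    for x in perm:
        if root is None:
            root = [x, None, None]
            continue
        node, depth = root, 0
        while True:
            depth += 1
            i = 1 if x < node[0] else 2
            if node[i] is None:
                node[i] = [x, None, None]
                break
            node = node[i]
        total += depth
    return total

random.seed(1)
N, TRIALS = 1000, 200
costs = [construction_cost(random.sample(range(N), N)) for _ in range(TRIALS)]
print(sum(costs) / TRIALS)  # theoretical mean is 2(N+1)H_N - 4N, roughly 11,000

try:
    import matplotlib.pyplot as plt  # plotting is optional in this sketch
    plt.hist(costs, bins=25)
    plt.xlabel("BST construction cost")
    plt.ylabel("frequency")
    plt.savefig("bst-construction-costs.png")
except Exception:
    pass
```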

Figure 1.4 actually shows the histogram of construction costs, which are closely related to search costs (see the proof of Theorem 6.6). The alignment is very good, as the average is between 11 and 12 (it’s around 12,000 in Figure 1.4).

Exercise 6.35

The following Python script creates the requested plot.
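
Again, a sketch of what the script may look like, gathering node levels over many random BSTs (the parameters are our choice; matplotlib is optional):

```python
import random

def node_depths(perm):
    """Depth of each key in the BST built from perm, in insertion order."""
    root, depths = None, []
    for x in perm:
        if root is None:
            root = [x, None, None]
            depths.append(0)
            continue
        node, d = root, 0
        while True:
            d += 1
            i = 1 if x < node[0] else 2
            if node[i] is None:
                node[i] = [x, None, None]
                break
            node = node[i]
        depths.append(d)
    return depths

random.seed(1)
N, TRIALS = 1000, 50
hist = {}
for _ in range(TRIALS):
    for d in node_depths(random.sample(range(N), N)):
        hist[d] = hist.get(d, 0) + 1

mean = sum(d * c for d, c in hist.items()) / (N * TRIALS)
print(mean)  # average node level, about 11 for N = 1000

try:
    import matplotlib.pyplot as plt
    levels = sorted(hist)
    plt.bar(levels, [hist[d] for d in levels])
    plt.xlabel("node level")
    plt.ylabel("frequency")
    plt.savefig("bst-node-levels.png")
except Exception:
    pass
```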

The distribution looks slightly skewed to the right, also visible in Figure 1.4; while most nodes are around the average level, a few branches in an unbalanced BST can get unusually long.

Exercise 6.36

By iterating the recurrence for $p_N(u)$, we can get the following closed-form solution:

$$p_N(u) = \prod_{k=1}^N \frac{k - 1 + 2u}{k + 1} \quad \text{with } p_0(u)=1.$$

Let’s take the natural logarithm of both sides to make differentiation much easier

$$\ln p_N(u) = \sum_{k=1}^N \ln(k - 1 + 2u) - \sum_{k=1}^N \ln(k + 1).$$

The first derivative is

$$\frac{d}{du} \ln p_N(u) = \frac{p_N'(u)}{p_N(u)} = \sum_{k=1}^N \frac{2}{k - 1 + 2u}.$$

The average is $p_N'(1) = 2(H_{N+1} - 1)$. The second derivative is

$$\frac{d^2}{du^2} \ln p_N(u) = \frac{p_N''(u)p_N(u) - p_N'(u)^2}{p_N(u)^2}= \frac{d}{du} \sum_{k=1}^N \frac{2}{k - 1 + 2u} = \sum_{k=1}^N \frac{-4}{(k - 1 + 2u)^2}.$$

Evaluating it at $u=1$ gives

$$p_N''(1) - p_N'(1)^2= \sum_{k=1}^N \frac{-4}{(k + 1)^2}= -4(H_{N+1}^{(2)} - 1).$$

Recall that $p_N(1)=1$, since $p_N(u)$ represents a probability distribution. Substituting all components back into the formula for the variance, we get the equation from the book.
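
All of this can be cross-checked by expanding $p_N(u)$ as an exact polynomial. Combining the components above, the variance works out to $2H_{N+1} - 4H^{(2)}_{N+1} + 2$, which is what the sketch below asserts:

```python
from fractions import Fraction

def pgf(N):
    """Exact coefficients of p_N(u) = prod_{k=1}^{N} (k - 1 + 2u)/(k + 1)."""
    poly = [Fraction(1)]
    for k in range(1, N + 1):
        const, lin = Fraction(k - 1, k + 1), Fraction(2, k + 1)
        new = [Fraction(0)] * (len(poly) + 1)
        for i, c in enumerate(poly):
            new[i] += c * const
            new[i + 1] += c * lin
        poly = new
    return poly

def harmonic(n, r=1):
    return sum(Fraction(1, k ** r) for k in range(1, n + 1))

for N in range(1, 9):
    p = pgf(N)
    mean = sum(i * c for i, c in enumerate(p))
    variance = sum(i * i * c for i, c in enumerate(p)) - mean ** 2
    assert mean == 2 * (harmonic(N + 1) - 1)
    assert variance == 2 * harmonic(N + 1) - 4 * harmonic(N + 1, 2) + 2
```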

Exercise 6.37

We reuse many ideas from Exercise 6.33. Searching for a randomly chosen key in a tree of size $N$ encompasses the following cases:

  1. We hit the root with probability $\frac{1}{N}$.

  2. Because the left subtree of size $k$ is uniformly distributed, and the probability that the target key happens to be in that specific left subtree is $\frac{k}{N}$, the search costs 1 (for the root) plus the successful search cost within a tree of size $k$.

  3. A successful search in a singleton tree entails $s_1(u)=u$.

This gives

$$\begin{align*} s_N(u) &= \frac{1}{N} u + \sum_{k=0}^{N-1} \frac{1}{N} \cdot \frac{k}{N} \cdot u \cdot s_k(u) + \sum_{k=0}^{N-1} \frac{1}{N} \cdot \frac{N-1-k}{N} \cdot u \cdot s_{N-1-k}(u) \\ &= \frac{u}{N} + \frac{2u}{N^2} \sum_{k=0}^{N-1} k \cdot s_k(u) && \text{(due to symmetry)} \\ &= \frac{u}{N^2} + \frac{(N-1)(N-1+2u)}{N^2} s_{N-1}(u) && \text{(after eliminating summation)}. \end{align*}$$

Differentiating both sides using the product rule gives

N2sN(u)=1+(N1)[2sN1(u)+(N1+2u)sN1(u)].N^2 s_N'(u) = 1 + (N-1) \left[ 2 \cdot s_{N-1}(u) + (N-1+2u) \cdot s_{N-1}'(u) \right].

Evaluated at u=1u=1, we get the average (see Exercise 2.14)

sN(1)=2N1N2+N21N2sN1(1)=2HN3+2HN/N.\begin{align*} s_N'(1) &= \frac{2N - 1}{N^2} + \frac{N^2 - 1}{N^2} s_{N-1}'(1) \\ &= \boxed{2H_N - 3 + 2H_N/N}. \end{align*}

The second derivative is

N2sN(u)=(N1)[2sN1(u)+2sN1(u)+(N1+2u)sN1(u)]=(N1)[4sN1(u)+(N1+2u)sN1(u)].\begin{align*} N^2 s_N''(u) &= (N-1) \left[ 2s_{N-1}'(u) + 2s_{N-1}'(u) + (N-1+2u) s_{N-1}''(u) \right] \\ &= (N-1) \left[ 4s_{N-1}'(u) + (N-1+2u) s_{N-1}''(u) \right]. \end{align*}

Evaluated at u=1u=1, we get (see Exercise 2.14 and Exercise 3.8)

N2sN(1)=4(N1)sN1(1)+(N21)sN1(1)sN(1)=(1+1N)(4HN24HN(2)12HN+16)+8HN16N.\begin{align*} N^2 s_N''(1) &= 4(N-1) s_{N-1}'(1) + (N^2 - 1) s_{N-1}''(1) \\ &\therefore s_N''(1) = \left(1 + \frac{1}{N}\right) \left(4H_N^2 - 4H_N^{(2)} - 12H_N + 16\right) + \frac{8H_N - 16}{N}. \end{align*}

After substituting all components into the formula for the variance, we get

σ2=2HN4HN(2)+4+10HN4HN24HN(2)N4HN2N2=N+1N(2HN+14HN+1(2)+2)N+1N2(2HN+12)2+2.\begin{align*} \sigma^2 &= 2H_N - 4H_N^{(2)} + 4 + \frac{10H_N - 4H_N^2 - 4H_N^{(2)}}{N} - \frac{4H_N^2}{N^2} \\ &= \boxed{\frac{N+1}{N} \left(2H_{N+1} - 4H_{N+1}^{(2)} + 2\right) - \frac{N+1}{N^2} \left(2H_{N+1} - 2\right)^2 + 2}. \end{align*}
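Both closed forms can be verified exactly against the simplified recurrence for s_N(u) (a sketch; helper names are ours):

```python
from fractions import Fraction

def successful_pgf(N):
    """Coefficients of s_N(u) from N^2 s_N(u) = u + (N-1)(N-1+2u) s_{N-1}(u)."""
    s = [Fraction(0)]                    # s_0 is never used (factor N-1 vanishes)
    for n in range(1, N + 1):
        conv = [Fraction(0)] * (len(s) + 1)
        for i, c in enumerate(s):        # (n-1+2u) * s_{n-1}(u)
            conv[i] += (n - 1) * c
            conv[i + 1] += 2 * c
        new = [(n - 1) * c for c in conv]
        new[1] += 1                      # the toll term u
        s = [c / (n * n) for c in new]
    return s

N = 10
H = [Fraction(0)]
for k in range(1, N + 2):
    H.append(H[-1] + Fraction(1, k))     # harmonic numbers H_0 .. H_{N+1}
H2 = sum(Fraction(1, k * k) for k in range(1, N + 2))
probs = successful_pgf(N)
mean = sum(j * p for j, p in enumerate(probs))
second = sum(j * j * p for j, p in enumerate(probs))
assert mean == 2 * H[N] - 3 + 2 * H[N] / N
assert second - mean**2 == (Fraction(N + 1, N) * (2 * H[N + 1] - 4 * H2 + 2)
                            - Fraction(N + 1, N * N) * (2 * H[N + 1] - 2) ** 2 + 2)
```

For N=2 the PGF is (u+u^2)/2, giving mean 3/2 and variance 1/4, in agreement with both boxed formulas.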

Exercise 6.38 🌟

circle-check

Let sN(u)s_N(u) be the PGF for successful search and pN(u)p_N(u) for unsuccessful search (see the previous exercise). We define these PGFs as follows:

  • A successful search costs l(v)+1l(v)+1 (where l(v)l(v) is the internal node's level), so sN(u)=1N Internalul(v)+1s_N(u) = \frac{1}{N} \sum_{\text{v $\in$ Internal}} u^{l(v)+1}.

  • An unsuccessful search costs l(v)l(v) (where l(v)l(v) is the external node's level), so pN(u)=1N+1 Externalul(v)p_N(u) = \frac{1}{N+1} \sum_{\text{v $\in$ External}} u^{l(v)}.

Establishing the Relationship between PGFs

Let PN(u)=(N+1)pN(u)P_N(u)=(N+1)p_N(u) and SN(u)=NsN(u)/uS_N(u)=Ns_N(u)/u. We aim to express these two quantities by inspecting how any binary tree is built from scratch. For an empty tree (N=0N=0), we have P0(u)=1P_0(u)=1, since such a tree consists of a single external node. The NNth insertion replaces an external node at level ll with an internal node, creating two new external nodes at level l+1l+1. Therefore,

ΔSN(u)=ul and ΔPN(u)=2ul+1ul=(2u1)ΔSN(u).\Delta S_N(u)=u^l \text{ and } \Delta P_N(u)=2u^{l+1}-u^l=(2u-1)\Delta S_N(u).

We can conclude, and prove by induction using the above statements, that the following identity holds

PN(u)=1+(2u1)SN(u)u(N+1)pN(u)uPN(u)=u+(2u1)NsN(u)uSN(u).P_N(u)=1+(2u-1)S_N(u) \\[0.3cm] \therefore \underbrace{u(N+1)p_N(u)}_{uP_N(u)} = u + (2u-1)\underbrace{Ns_N(u)}_{uS_N(u)}.
circle-info

If we plug u=1/2u=1/2 into  Externalul(v)=1+(2u1) Internalul(v)\sum_{\text{v $\in$ External}} u^{l(v)}=1+(2u-1)\sum_{\text{v $\in$ Internal}} u^{l(v)}, we get 1, which is the Kraft equality that describes the profile of a binary tree (see Exercise 18).

Computing Moments

Now, we follow the standard approach for calculating moments. The first derivative is

(N+1)(pN(u)+upN(u))=1+N(2sN(u)+(2u1)sN(u)).(N+1)(p_N(u) + up_N'(u)) = 1 + N(2s_N(u) + (2u-1)s_N'(u)).

Evaluating at u=1u=1 gives the average cost of a successful search

(N+1)(1+pN(1))=1+N(2+sN(1))sN(1)=N+1NpN(1)1=2HN3+2HN/N.(N+1)(1 + p_N'(1)) = 1 + N(2 + s_N'(1)) \\[0.3cm] \therefore s_N'(1) = \frac{N+1}{N} p_N'(1) - 1=2H_N-3+2H_N/N.

Recall that pN(1)=sN(1)=1p_N(1)=s_N(1)=1 and we already know pN(1)p'_N(1) from Theorem 6.6. The second derivative is

(N+1)(2pN(u)+upN(u))=N(4sN(u)+(2u1)sN(u)).(N+1)(2p_N'(u) + up_N''(u)) = N(4s_N'(u) + (2u-1)s_N''(u)).

Evaluating at u=1u=1 gives

(N+1)(2pN(1)+pN(1))=N(4sN(1)+sN(1))sN(1)=N+1N(2pN(1)+pN(1))4sN(1).(N+1)(2p_N'(1) + p_N''(1)) = N(4s_N'(1) + s_N''(1)) \\[0.3cm] \therefore s_N''(1) = \frac{N+1}{N}(2p_N'(1) + p_N''(1)) - 4s_N'(1).

The variance of a successful search is σS2=sN(1)+sN(1)sN(1)2\sigma^2_S= s_N''(1) + s_N'(1) - s_N'(1)^2. Thus,

σS2=N+1N(2pN(1)+pN(1))3sN(1)sN(1)2=N+1NσU2N+1N2pN(1)2+2=N+1N(2HN+14HN+1(2)+2)N+1N2(2HN+12)2+2.\begin{align*} \sigma^2_S &= \frac{N+1}{N}(2p_N'(1) + p_N''(1)) - 3s_N'(1) - s_N'(1)^2 \\[0.3cm] &= \frac{N+1}{N} \sigma^2_U - \frac{N+1}{N^2} p_N'(1)^2 + 2 \\[0.3cm] &= \frac{N+1}{N} \left(2H_{N+1} - 4H_{N+1}^{(2)} + 2\right) - \frac{N+1}{N^2} \left(2H_{N+1} - 2\right)^2 + 2. \end{align*}

Exercise 6.39 🌟

circle-check
circle-exclamation

Theorem 6.7 is built to relate a global, additive parameter c(t)c(t) to a local toll function e(t)e(t).

  • EG(z)E_G(z) is the CGF for the local toll e(t)e(t). In our case, it’s the degree of the root of a tree.

  • CG(z)C_G(z) is the CGF for the global cost c(t)c(t). The global cost is the sum of the local tolls for every single node in the entire tree.

To find the average of just that local toll across all trees of size NN, we must extract the coefficient from EG(z)E_G(z) and divide it by the total number of trees:

Average Root Degree=[zN]EG(z)GN.\text{Average Root Degree}=\frac{[z^N]E_G(z)}{G_N}.

This is different from what the Corollary says in the book. Our local toll function e(t)e(t) is simply the degree of the root of tt. The recurrence relation of c(t)c(t) entails that we’re summing the root degree of the main tree, plus the root degrees of its subtrees, and so on. This turns out to be the total number of edges in a tree, hence c(t)=t1c(t) = |t| - 1. Now we can work backwards, by first finding CG(z)C_G(z). We have

CG(z)=tGc(t)zt=tG(t1)zt=zG(z)G(z)=z14z114z2.\begin{align*} C_G(z) &= \sum_{t \in \mathcal{G}} c(t) z^{|t|} \\ &= \sum_{t \in \mathcal{G}} (|t|-1) z^{|t|} \\ &= zG'(z) - G(z) \\ &= \frac{z}{\sqrt{1-4z}} - \frac{1 - \sqrt{1-4z}}{2}. \end{align*}

By using the identity from Theorem 6.7, we get

EG(z)=2CG(z)1+114z=2CG(z)14z14z+1=12z14z14z+1.\begin{align*} E_G(z) &= \frac{2 C_G(z)}{1 + \frac{1}{\sqrt{1-4z}}} \\ &= \frac{2 C_G(z) \sqrt{1-4z}}{\sqrt{1-4z} + 1} \\ &= \frac{1 - 2z - \sqrt{1-4z}}{\sqrt{1-4z} + 1}. \end{align*}

We can use the substitutions z=G(z)(1G(z))z = G(z)(1-G(z)) and 14z=12G(z)\sqrt{1-4z} = 1 - 2G(z) to define everything in terms of G(z)G(z) (see also Exercise 41)

EG(z)=G(z)3z.E_G(z) =\frac{G(z)^3}{z}.

The total sum of root degrees for all trees of size NN is

[zN]EG(z)=[zN+1]G(z)3.[z^N] E_G(z) = [z^{N+1}] G(z)^3.

Using the Lagrange inversion theorem for Catalan trees ([zN]G(z)k=k2Nk(2NkN)[z^N] G(z)^k = \frac{k}{2N-k} \binom{2N-k}{N}), we have

[zN]EG(z)=32N1(2N1N+1).[z^N] E_G(z) = \frac{3}{2N-1} \binom{2N-1}{N+1}.

After dividing by GN=TN1G_N=T_{N-1}, we get that the average number of children of the root in a random Catalan tree of NN nodes is 3(N1)/(N+1) 3(N-1)/(N+1). This perfectly matches the examples in Figure 6.3 from the book.
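The closed form is quick to confirm by direct convolution of Catalan numbers (helper names are ours):

```python
from math import comb
from fractions import Fraction

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def avg_root_degree(N):
    """[z^{N+1}] G(z)^3 divided by G_N, where G(z) = sum_{n>=1} catalan(n-1) z^n."""
    G = [0] + [catalan(n - 1) for n in range(1, N + 2)]
    # triple convolution over a + b + c = N + 1 with a, b, c >= 1
    total = sum(G[a] * G[b] * G[N + 1 - a - b]
                for a in range(1, N) for b in range(1, N + 1 - a))
    return Fraction(total, catalan(N - 1))

for N in range(2, 12):
    assert avg_root_degree(N) == Fraction(3 * (N - 1), N + 1)
```

For N=3 this yields 3/2: of the two general trees on three nodes, one has root degree 1 and the other root degree 2.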

Exercise 6.40

The toll CGF follows from Exercise 4, hence

[zN]EG(z)=GN1    EG(z)=zG(z).[z^N]E_G(z)=G_{N-1} \implies E_G(z)=zG(z).

Applying Theorem 6.7 gives

CG(z)=12EG(z)(1+114z)=12zG(z)(14z+114z)=12z(114z2)(1+14z14z)=z214z.\begin{align*} C_G(z) &= \frac{1}{2} E_G(z) \left( 1 + \frac{1}{\sqrt{1-4z}} \right) \\ &= \frac{1}{2} z G(z) \left( \frac{\sqrt{1-4z} + 1}{\sqrt{1-4z}} \right) \\ &= \frac{1}{2} z \left( \frac{1 - \sqrt{1-4z}}{2} \right) \left( \frac{1 + \sqrt{1-4z}}{\sqrt{1-4z}} \right) \\ &= \frac{z^2}{\sqrt{1-4z}}. \end{align*}

Based on the Corollary of Theorem 6.7, the average number of 1-child nodes per tree is

[zN]CG(z)GN=N(N1)4N6.\frac{[z^N]C_G(z)}{G_N}=\frac{N(N-1)}{4N-6}.

From Figure 6.3 in the book, the answer is 10/710/7 for N=5N=5. This perfectly matches our formula. To get the proportion of nodes, we need to divide this amount by NN. Therefore, the answer is (N1)/(4N6)(N-1)/(4N-6).

Exercise 6.41

The toll CGF follows from Exercise 5, hence EG(z)=zG(z)k.E_G(z)=zG(z)^k. Applying Theorem 6.7 gives

CG(z)=12EG(z)(1+114z)=12zG(z)k(14z+114z)=12z(114z2)k(1+14z14z).\begin{align*} C_G(z) &= \frac{1}{2} E_G(z) \left( 1 + \frac{1}{\sqrt{1-4z}} \right) \\ &= \frac{1}{2} z G(z)^k \left( \frac{\sqrt{1-4z} + 1}{\sqrt{1-4z}} \right) \\ &= \frac{1}{2} z \left( \frac{1 - \sqrt{1-4z}}{2} \right)^k \left( \frac{1 + \sqrt{1-4z}}{\sqrt{1-4z}} \right). \end{align*}

Extracting coefficients from this GF is quite a mess. Instead, we’ll apply the extended Lagrange inversion theorem (see also the remark in Exercise 39). For Catalan trees, the standard functional equation is

G(z)=zf(G(z)), where f(u)=11u.G(z) = z f(G(z)), \text{ where }f(u) = \frac{1}{1-u}.

Consequently, z=G(z)(1G(z))z = G(z)(1 - G(z)) and Theorem 6.7 simplifies to

CG(z)=EG(z)1zf(G(z))=G(z)k+1(1G(z))1zf(G(z)).C_G(z) = \frac{E_G(z)}{1 - zf'(G(z))}=\frac{G(z)^{k+1}(1-G(z))}{1 - zf'(G(z))}.

Let’s express the numerator in terms of g(u)g(u), so the coefficient extraction is

[zN]g(G(z))1zf(G(z))=[uN]g(u)f(u)N.[z^N] \frac{g(G(z))}{1 - zf'(G(z))} = [u^N] g(u) f(u)^N.

This gives

[zN]CG(z)=[uN]uk+1(1u)g(u)(1u)N=[uN]uk+1(1u)(N1)=[uNk1](1u)(N1)=(2Nk3Nk1).\begin{align*} [z^N] C_G(z) &= [u^N] \underbrace{u^{k+1}(1-u)}_{g(u)} (1-u)^{-N} \\ &= [u^N] u^{k+1} (1-u)^{-(N-1)} \\ &= [u^{N-k-1}] (1-u)^{-(N-1)} = \binom{2N - k - 3}{N - k - 1}. \end{align*}

The exact proportion of nodes with kk children is (see also the previous exercise)

(2Nk3Nk1)NGN=(2Nk3Nk1)(2N2N1). \frac{\dbinom{2N - k - 3}{N - k - 1}}{NG_N}= \frac{\dbinom{2N - k - 3}{N - k - 1}}{\dbinom{2N-2}{N-1}}.
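As a consistency check, the counts for all values of k must add up to the total number of nodes N G_N over all trees (a sketch using exact binomials):

```python
from math import comb

# nodes with k children, summed over all k, must equal N * G_N = C(2N-2, N-1)
for N in range(2, 16):
    total = sum(comb(2 * N - k - 3, N - k - 1) for k in range(N))
    assert total == comb(2 * N - 2, N - 1)
```

This is an instance of the hockey-stick identity, summed along the diagonal of Pascal's triangle.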

Exercise 6.42

We already know from the book that the average number of leaves in a random binary Catalan tree of NN nodes is N(N+1)/(2(2N1))N(N+1)/(2(2N-1)). Hence, the fraction of leaves is (N+1)/(2(2N1))(N+1)/(2(2N-1)).

To compute the fraction of internal nodes with a single child, we apply Theorem 6.7 with the toll CGF ET(z)=2z(T(z)1)E_T(z)=2z(T(z)-1) (see also Exercise 40). Here the factor T(z)1T(z)-1 excludes the empty tree, while the factor 2 accounts for whether the nonempty subtree hangs on the left or on the right. This gives

CT(z)=ET(z)14z=2z(T(z)1)14z=12z14z1.\begin{align*} C_T(z) &= \frac{E_T(z)}{\sqrt{1-4z}} \\ &= \frac{2z(T(z)-1)}{\sqrt{1-4z}} \\ &= \frac{1 - 2z}{\sqrt{1-4z}} - 1. \end{align*}

The fraction of nodes of this type is

[zN]CT(z)NTN=(2NN)N12N1NN+1(2NN)=(N1)(N+1)N(2N1). \frac{[z^N]C_T(z)}{NT_N}=\frac{\binom{2N}{N} \frac{N-1}{2N-1}}{\frac{N}{N+1}\binom{2N}{N}} = \frac{(N-1)(N+1)}{N(2N-1)}.

To compute the fraction of internal nodes with two children, we can apply Theorem 6.7 with the toll CGF ET(z)=z(T(z)1)2E_T(z)=z(T(z)-1)^2. But there is a simpler way. Namely, the sum of fractions must be 1, since we cover all cases. Therefore, the fraction of nodes with two children is

1N+12(2N1)(N1)(N+1)N(2N1)=(N1)(N2)2N(2N1).1 - \frac{N+1}{2(2N-1)} - \frac{(N-1)(N+1)}{N(2N-1)}= \frac{(N-1)(N-2)}{2N(2N-1)}.

Exercise 6.43

We already know from the book that the average number of leaves in a random BST of NN nodes is (N+1)/3(N+1)/3. Hence, the fraction of leaves is (N+1)/(3N)(N+1)/(3N).

To compute the fraction of internal nodes with a single child, we apply Theorem 6.7 with eN=2/Ne_N=2/N for N2N \ge 2. This follows from the fact that the root of a subtree of size NN must be the smallest or largest element in order to have a single child. Thus,

EB(z)=N22NzN=2ln(11z)2zEB(z)=2z1z.\begin{align*} E_B(z) &= \sum_{N \ge2} \frac{2}{N} z^N = 2 \ln\left(\frac{1}{1-z}\right) - 2z \\ E'_B(z) &= \frac{2z}{1-z}. \end{align*}

This gives the same fraction as for leaves

CB(z)=1(1z)20z(1x)2(2x1x)dx=z2(1z)22z33(1z)2[zN]CB(z)=N+13.\begin{align*} C_B(z) &= \frac{1}{(1-z)^2} \int_0^z (1-x)^2 \left(\frac{2x}{1-x}\right) dx \\ &= \frac{z^2}{(1-z)^2} - \frac{2z^3}{3(1-z)^2} \\ &\therefore [z^N]C_B(z) = \frac{N+1}{3}. \end{align*}

The fraction of nodes with two children is

12N+13N=N23N.1-2 \cdot \frac{N+1}{3N}=\frac{N-2}{3N}.
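All three fractions can be confirmed by brute force over the N! insertion orders (a sketch; the helper name node_type_fractions is ours):

```python
from itertools import permutations
from fractions import Fraction

def node_type_fractions(N):
    """Fractions of 0-, 1-, and 2-child nodes, averaged over all N! BSTs."""
    counts = [0, 0, 0]
    for perm in permutations(range(N)):
        left, right = {}, {}
        root = perm[0]
        for x in perm[1:]:               # standard BST insertion
            cur = root
            while True:
                child = left if x < cur else right
                if cur in child:
                    cur = child[cur]
                else:
                    child[cur] = x
                    break
        for v in range(N):
            counts[(v in left) + (v in right)] += 1
    total = sum(counts)
    return [Fraction(c, total) for c in counts]

N = 6
leaf, one, two = node_type_fractions(N)
assert leaf == Fraction(N + 1, 3 * N)
assert one == Fraction(N + 1, 3 * N)
assert two == Fraction(N - 2, 3 * N)
```

Permutations are the right sample space here, since subtree sizes of a random BST are induced by uniformly random insertion orders.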

Exercise 6.44

For all subtasks, the mechanics of calculating the variance is based on Table 3.5. We omit the technical details and only present the derivation of BGFs with final results for variances.

The BGF for the number of leaves in a random binary Catalan tree is given in Section 5.4 of the book

T(z,u)+z=1+zu+zT(z,u)2.T(z,u)+z=1+zu+zT(z,u)^2.

Therefore,

σN2=N(N+1)(N1)(N2)2(2N1)2(2N3).\boxed{\sigma_N^2 = \frac{N(N+1)(N-1)(N-2)}{2(2N-1)^2(2N-3)}}.

The symbolic method may help a lot in finding the BGF for the number of leaves in a random general Catalan tree. A tree is either a leaf or a root attached to a non-empty forest. Therefore,

G=Z+Z×(FE).G=Z+Z \times (\mathcal{F}-E).

We should mark the leaf, so this translates to

G(z,u)=zu+zG(z,u)1G(z,u)=zu+z(1u)G(z,u)+G(z,u)2.G(z,u) = zu + \frac{zG(z,u)}{1 - G(z,u)}= zu + z(1-u)G(z,u) + G(z,u)^2.

The variance is

σN2=N(N2)4(2N3).\boxed{\sigma_N^2 =\frac{N(N-2)}{4(2N-3)}}.

Setting up the BGF in case of a random BST proceeds along the lines of the computation of moments presented in Section 3.10 (see the Quicksort distribution subsection) of the book. For the latter, consult also the derivation shown in Exercise 3.67. Let QN(u)Q_N(u) be the PGF for the number of leaves in a random BST of size NN. The base cases are:

  • An empty tree has no leaves, so Q0(u)=u0=1Q_0(u)=u^0=1.

  • A singleton tree has one leaf, so Q1(u)=uQ_1(u)=u.

Subtree sizes are uniformly distributed with probability 1/N1/N, as emphasized in the book, hence (see also Exercise 3.62)

QN(u)=1Nj=0N1Qj(u)QN1j(u) for N2.Q_N(u) = \frac{1}{N} \sum_{j=0}^{N-1} Q_j(u) Q_{N-1-j}(u) \quad \text{ for $N\ge2$}.

As in the proof of Theorem 6.3, to simplify this recurrence, we move to an "enumerative" approach

NQN(u)=j=0N1Qj(u)QN1j(u).N Q_N(u) = \sum_{j=0}^{N-1} Q_j(u) Q_{N-1-j}(u).

Let our exponential BGF be Q(z,u)=N0QN(u)zN.Q(z,u) = \sum_{N\ge0} Q_N(u) z^N. If we multiply our recurrence by zN1z^{N-1} and sum on N1N \ge 1, the LHS transforms into

N1NQN(u)zN1=Qz(z,u).\sum_{N\ge1} N Q_N(u) z^{N-1} = Q_z(z,u).

The RHS becomes a self-convolution (after making an adjustment for N=1N=1 to balance it with the LHS)

N1(j=0N1Qj(u)QN1j(u))zN1=Q(z,u)2+u1.\sum_{N\ge1} \left(\sum_{j=0}^{N-1} Q_j(u) Q_{N-1-j}(u)\right)z^{N-1}= Q(z,u)^2 + u - 1.

We arrive at the following ODE

Qz(z,u)=Q(z,u)2+u1,Q_z(z,u)=Q(z,u)^2 + u - 1,

with initial condition Q(0,u)=1.Q(0,u) = 1. From here, we virtually follow the same steps as in Exercise 3.67. The variance is

σN2=2(N+1)45 for N4.\boxed{\sigma_N^2 =\frac{2(N+1)}{45} \quad \text{ for $N \ge 4$}}.

Exercise 6.45

We assume that e(t)=f(t)=eNe(t)=f(|t|)=e_N, which is the case for almost all toll functions of interest. Let's define a toll BGF

E(z,u)=N0zNueN.E(z,u) = \sum_{N\ge0} z^Nu^{e_N}.

The BGF version of Theorem 6.7 uses the Hadamard product of GFs

FT(z,u)=1+E(z,u)z(zFT(z,u)2)(binary Catalan trees)FG(z,u)=E(z,u)z(z1FG(z,u))(general Catalan trees)zzFB(z,u)=E(z,u)z(zFB(z,u)2)(binary search trees).\begin{align*} F_T(z,u) &= 1 + E(z,u) \odot_z \left(z F_T(z,u)^2 \right) && \text{(binary Catalan trees)} \\ F_G(z,u) &= E(z,u) \odot_z \left( \frac{z}{1 - F_G(z,u)} \right) && \text{(general Catalan trees)} \\ z\frac{\partial}{\partial z} F_B(z,u) &= E(z,u) \odot_z \left( zF_B(z,u)^2 \right) && \text{(binary search trees)}. \end{align*}

Proof:

Let T\mathcal{T} be the set of all binary Catalan trees. We have

FT(z,u)=tTztuc(t)=z0u0+E(z,u)z(tl,trTztl+tr+1uc(tl)+c(tr))=1+E(z,u)z(ztlTztluc(tl)trTztruc(tr))=1+E(z,u)z(zFT(z,u)2).\begin{align*} F_T(z,u) &= \sum_{t \in \mathcal{T}} z^{|t|}u^{c(t)} \\ &= z^0u^0+ E(z,u) \odot_z \left(\sum_{t_l, t_r \in \mathcal{T}} z^{|t_l|+|t_r|+1} u^{c(t_l) + c(t_r)}\right) \\ &= 1 +E(z,u) \odot_z \left(z\sum_{t_l \in \mathcal{T}} z^{|t_l|} u^{c(t_l)} \sum_{t_r \in \mathcal{T}} z^{|t_r|} u^{c(t_r)} \right)\\ &= 1 +E(z,u) \odot_z \left(zF_T(z,u)^2\right). \end{align*}

Let G\mathcal{G} be the set of all general Catalan trees. We have

FG(z,u)=tGztuc(t)=E(z,u)z(k0t1,t2,,tkGzt1+t2++tk+1uc(t1)+c(t2)++c(tk))=E(z,u)z(z1FG(z,u)).\begin{align*} F_G(z,u) &= \sum_{t \in \mathcal{G}} z^{|t|}u^{c(t)} \\ &= E(z,u) \odot_z \left(\sum_{k \ge 0}\sum_{t_1,t_2,\dots,t_k \in \mathcal{G}} z^{|t_1|+|t_2|+ \dots +|t_k|+1} u^{c(t_1)+c(t_2)+ \dots +c(t_k)} \right)\\ &= E(z,u) \odot_z \left(\frac{z}{1 - F_G(z,u)}\right). \end{align*}

For BSTs we start with the PGF

QN(u)=ueNNj=0N1Qj(u)QN1j(u) for N1,Q_N(u) = \frac{u^{e_N}}{N} \sum_{j=0}^{N-1} Q_j(u) Q_{N-1-j}(u) \quad \text{ for $N\ge1$},

and continue as in the previous exercise. We no longer need the correction factor for N=1N=1, since the toll function eNe_N handles all cases. Finally, instead of multiplying by zN1z^{N-1} we must multiply by zNz^N to be able to smoothly apply the toll BGF. This produces the extra zz factor on both sides.

Exercise 6.46

We start by deriving the OGF for Fibonacci polynomials

Q(u)=h0Fh+1(z)uh=F1(z)+F2(z)u+h2(Fh(z)zFh1(z))uh=1+u+uh2Fh(z)uh1zu2h2Fh1(z)uh2=1+u+u(Q(u)1)zu2Q(u)=11u+zu2.\begin{align*} Q(u) &= \sum_{h \ge 0} F_{h+1}(z) u^h \\ &= F_1(z) + F_2(z)u + \sum_{h \ge 2} (F_h(z) - z F_{h-1}(z)) u^h \\ &= 1 + u + u \sum_{h \ge 2} F_h(z) u^{h-1} - zu^2 \sum_{h \ge 2} F_{h-1}(z) u^{h-2} \\ &= 1 + u + u(Q(u) - 1) - zu^2 Q(u) \\ &= \boxed{\frac{1}{1 - u + zu^2}}. \end{align*}

Now, we expand this back into a series, so that we can extract the coefficient Fh+1(z)F_{h+1}(z).

Q(u)=11(1zu)u=k0uk(1zu)k=k0ukj=0k(kj)(zu)j=k0j=0k(kj)(z)juk+j[uh]Q(u)=Fh+1(z)=j(hjj)(z)j.\begin{align*} Q(u) &= \frac{1}{1 - (1 - zu)u} \\ &= \sum_{k \ge 0} u^k (1 - zu)^k \\ &= \sum_{k \ge 0} u^k \sum_{j=0}^k \binom{k}{j} (-zu)^j \\ &= \sum_{k \ge 0} \sum_{j=0}^k \binom{k}{j} (-z)^j u^{k+j} \\ &\therefore \boxed{[u^h] Q(u)=F_{h+1}(z) = \sum_{j} \binom{h-j}{j} (-z)^j}. \end{align*}
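A quick check that the extracted coefficient agrees with the defining recurrence F_{h+1} = F_h - z F_{h-1} (helper names are ours):

```python
from math import comb

def fib_poly_rec(h):
    """Coefficients of F_h(z) from F_{h+1} = F_h - z F_{h-1}, F_1 = F_2 = 1."""
    a, b = [1], [1]                      # F_1, F_2
    for _ in range(h - 2):
        nxt = b[:] + [0] * (len(a) + 1 - len(b))
        for i, c in enumerate(a):        # subtract z * F_{h-1}
            nxt[i + 1] -= c
        a, b = b, nxt
    return a if h == 1 else b

def fib_poly_closed(h):
    # F_{h+1}(z) = sum_j C(h-j, j) (-z)^j, so shift the index by one
    n = h - 1
    return [comb(n - j, j) * (-1) ** j for j in range(n // 2 + 1)]

for h in range(1, 12):
    assert fib_poly_rec(h) == fib_poly_closed(h)
```

For instance F_5(z) = 1 - 3z + z^2 appears as the coefficient list [1, -3, 1] in both versions.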

Exercise 6.47

First, we need to set up the equation for applying the Lagrange inversion theorem (see the third case as well as the remark in Exercise 39)

G(z)G[h2](z)=14zuh1uh=14zk1ukh=k11u1+uukhg(u).\begin{align*} G(z) - G^{[h-2]}(z) &= \sqrt{1-4z} \frac{u^h}{1 - u^h} \\[0.3cm] &= \sqrt{1-4z} \sum_{k \ge 1} u^{kh} \\ &= \sum_{k \ge 1} \underbrace{\frac{1-u}{1+u} u^{kh}}_{g(u)}. \end{align*}

The last step follows from identities stated in the book relating uu, zz and G(z)G(z), namely

z=u(1+u)2and14z=1u1+u.z = \frac{u}{(1+u)^2} \quad \text{and} \quad \sqrt{1-4z} = \frac{1-u}{1+u}.

We want to extract the coefficient [zN+1][z^{N+1}] from the above summand. The Lagrange inversion theorem says that if z=f(u)z = f(u), then

[zN+1]g(A(z))=1N+1[uN](g(u)(uf(u))N+1).[z^{N+1}] g(A(z)) = \frac{1}{N+1} [u^N] \left( g'(u) \left(\frac{u}{f(u)}\right)^{N+1} \right).

The first derivative of gg is

g(u)=2(1+u)2ukh+1u1+ukhukh1.g'(u) = \frac{-2}{(1+u)^2} u^{kh} + \frac{1-u}{1+u} kh u^{kh-1}.

Substituting it back into the theorem's formula, we get

[zN+1]g(A(z))=1N+1[uN](2ukh(1+u)2N+kh(1u)ukh1(1+u)2N+1)=2N+1[uNkh](1+u)2N+khN+1[uN+1kh](1u)(1+u)2N+1=2N+1(2NNkh)+khN+1((2N+1N+1kh)(2N+1Nkh))=2N+1(2NNkh)+khN+1((2NN+1kh)(2NN1kh))=(2NN+1kh)2(2NNkh)+(2NN1kh).\begin{align*} [z^{N+1}] g(A(z)) &= \frac{1}{N+1} [u^N] \left( -2u^{kh} (1+u)^{2N} + kh(1-u)u^{kh-1} (1+u)^{2N+1} \right) \\ &= \frac{-2}{N+1} [u^{N-kh}] (1+u)^{2N} + \frac{kh}{N+1} [u^{N+1-kh}] (1-u)(1+u)^{2N+1} \\ &= \frac{-2}{N+1} \binom{2N}{N-kh} + \frac{kh}{N+1} \left( \binom{2N+1}{N+1-kh} - \binom{2N+1}{N-kh} \right) \\ &= \frac{-2}{N+1} \binom{2N}{N-kh} + \frac{kh}{N+1} \left( \binom{2N}{N+1-kh} - \binom{2N}{N-1-kh} \right) \\ &= \binom{2N}{N+1-kh} - 2\binom{2N}{N-kh} + \binom{2N}{N-1-kh}. \end{align*}

The last step (the formula from the book) is the result of intense algebraic gymnastics behind the scenes; the absorption identity is helpful for grouping terms. Using a computer algebra system it’s easy to show that the last two lines are equal. All in all, putting it back into the sum over kk produces the equation of Theorem 6.9.
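The equality of the last two lines is also cheap to verify numerically, with d playing the role of kh and the convention that out-of-range binomials vanish (the helper C is ours):

```python
from math import comb

def C(n, k):
    """Binomial coefficient that vanishes outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

for N in range(2, 25):
    for d in range(1, N + 2):
        lhs = -2 * C(2 * N, N - d) + d * (C(2 * N, N + 1 - d) - C(2 * N, N - 1 - d))
        rhs = (N + 1) * (C(2 * N, N + 1 - d) - 2 * C(2 * N, N - d)
                         + C(2 * N, N - 1 - d))
        assert lhs == rhs
```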

circle-info

The background about that last line, as published in the book, is available in the presentation Analyses of Tree Height (slides 6-7). The whole expression is nicely captured in terms of the second-order central difference.

Exercise 6.48

We need to rigorously justify the transition from the discrete binomial sum to the continuous H(x)H(x) integral, specifically by managing the error bounds using Theorem 4.9 and Euler-Maclaurin summation. Of course, the proof sketch provided in the book remains the main scaffolding, so we fill in the gaps here.

To keep things aligned with Theorem 4.9, let’s define the central region as 1h<2N1+ϵ1 \le h < \sqrt{2N^{1+\epsilon}} for a small constant ϵ>0\epsilon>0. Notice that it strictly covers the book's range 1hNlogN1 \le h \le \sqrt{N}\log N in the proof of the corollary of Theorem 6.9.

As pointed out in the previous exercise, the sum denoting GNGN[h1]G_N-G_N^{[h-1]} is fundamentally adding up discrete curvatures of binomials at various distances (using step size δ=kh\delta=kh). Inside the central region, according to Theorem 4.6, we can efficiently employ the normal approximation of the binomial distribution. Thus, if a binomial coefficient acts like a Gaussian curve, then its discrete second difference (our term in the sum) must act like the continuous second derivative of that Gaussian curve. So,

ddδeδ2/N=(2δN)eδ2/Nd2dδ2eδ2/N=(4δ2N22N)eδ2/N=1N(4k2h2N2)ek2h2/N.\begin{align*} \frac{d}{d\delta} e^{-\delta^2/N} &= \left(\frac{-2\delta}{N}\right) e^{-\delta^2/N} \\ \frac{d^2}{d\delta^2} e^{-\delta^2/N} &= \left( \frac{4\delta^2}{N^2} - \frac{2}{N} \right) e^{-\delta^2/N} \\ &=\frac{1}{N} \left( \frac{4k^2h^2}{N} - 2 \right) e^{-k^2h^2/N}. \end{align*}
circle-info

The 1/N1/N factor gets "annihilated" due to division by GNG_N when moving into the realm of Catalan sums. Notice that H(h/N)H(h/\sqrt N) encompasses the expression we just derived.

Because we are swapping discrete binomials with the smooth function H(x)H(x), we introduce a relative error. Specifically, Theorem 4.9 tells us the approximation is valid up to a factor of

(1+O(1N12ϵ)).\left(1 + O\left(\frac{1}{N^{1-2\epsilon}}\right)\right).

Let's bound the total error introduced across this entire central region

Δcentral=1h2N1+ϵH(hN)O(N1+2ϵ)=O(N)O(N1+2ϵ)=O(N1/2+2ϵ).\Delta_{\text{central}} = \sum_{1 \le h \le \sqrt{2N^{1+\epsilon}}} H\left(\frac{h}{\sqrt{N}}\right) \cdot O\left(N^{-1+2\epsilon}\right)= O(\sqrt{N}) \cdot O\left(N^{-1+2\epsilon}\right) = O\left(N^{-1/2+2\epsilon}\right).

We know from the book that H(x)H(x) is bounded and decays (see also the next exercise), hence we can approximate the sum H(h/N)\sum H(h/\sqrt{N}) by its integral, which scales as O(N)O(\sqrt{N}) (proven below). Picking ϵ<1/4\epsilon<1/4, the error turns out to be o(1)o(1).

Outside of the central region, according to Theorem 4.9, the tail is bounded as O(e(2N)ϵ)O(e^{-(2N)^\epsilon}). This is exponentially small, meaning the tail sum is entirely negligible. We have now rigorously reduced the expected height to

E[HN]=h=1H(hN)+o(1).E[H_N] = \sum_{h=1}^\infty H\left(\frac{h}{\sqrt{N}}\right) + o(1).

To transition to the integral, we apply the Euler-Maclaurin formula to the function f(h)=H(h/N)f(h) = H(h/\sqrt{N}). We can apply the second form of Theorem 4.3 over the interval [1,M][1,M] and let MM \to \infty. Using MM here avoids a clash with NN (the number of nodes).

h=1Mf(h)=1Mf(h)dh+12f(M)+Cf+R0.\sum_{h=1}^M f(h) = \int_1^M f(h) dh + \frac{1}{2}f(M) + C_f + R_0.

Because H(x)H(x) and all its derivatives decay to zero as xx \to \infty, we can safely take the limit as MM \to \infty. This gives

h=1f(h)=1f(h)dh+Cf.\sum_{h=1}^\infty f(h) = \int_1^\infty f(h) dh + C_f.

To evaluate the integral, we temporarily reuse the result from the book, so

1f(h)dh=N1/NH(x)dx=N(0H(x)dx01/NH(x)dx)=πN+O(1)(H(x)1 for all x).\begin{align*} \int_1^\infty f(h) dh &= \sqrt{N} \int_{1/\sqrt{N}}^\infty H(x) dx \\ &= \sqrt{N} \left( \int_0^\infty H(x) dx - \int_0^{1/\sqrt{N}} H(x) dx \right) \\ &= \sqrt{\pi N}+ O(1) && \text{($|H(x)| \le 1$ for all $x$)}. \end{align*}

We must explicitly evaluate the integral 0H(x)dx\int_0^\infty H(x) dx. The trap is that we cannot simply swap the sum and the integral because the sum is not absolutely integrable over (0,)(0, \infty). A term-by-term integration produces zero for every kk. We should use the Theta function θ1(x)=k=1ek2x2\theta_1(x) = \sum_{k=1}^\infty e^{-k^2x^2}. We have

ddx[xk=1ek2x2]=k=1ek2x2k=12k2x2ek2x22ddx[xk=1ek2x2]=2k=1ek2x2+k=14k2x2ek2x2=H(x)0H(x)dx=2[xθ1(x)]0=2xθ1(x)x0=π.\begin{align*} \frac{d}{dx} \left[ x \sum_{k=1}^\infty e^{-k^2x^2} \right] &= \sum_{k=1}^\infty e^{-k^2x^2} - \sum_{k=1}^\infty 2k^2x^2 e^{-k^2x^2} \\ -2\frac{d}{dx} \left[ x \sum_{k=1}^\infty e^{-k^2x^2} \right] &= -2\sum_{k=1}^\infty e^{-k^2x^2} + \sum_{k=1}^\infty 4k^2x^2 e^{-k^2x^2} = H(x) \\ &\therefore \int_0^\infty H(x) dx = -2 \Big[ x \theta_1(x) \Big]_0^\infty=2x \theta_1(x) \Big|_{x \to 0} = \sqrt{\pi}. \end{align*}
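Both the antiderivative -2x\theta_1(x) and the limit value \sqrt{\pi} can be probed numerically (the truncation limits below are ad hoc choices):

```python
import math

def theta1(x, K=60):
    return sum(math.exp(-k * k * x * x) for k in range(1, K + 1))

def H(x, K=60):
    return sum((4 * k * k * x * x - 2) * math.exp(-k * k * x * x)
               for k in range(1, K + 1))

# trapezoidal rule on [a, b] vs. the closed-form antiderivative -2*x*theta1(x)
a, b, n = 0.5, 4.0, 4000
h = (b - a) / n
integral = (H(a) + H(b)) / 2 * h + sum(H(a + i * h) for i in range(1, n)) * h
assert abs(integral - 2 * (a * theta1(a) - b * theta1(b))) < 1e-4

# x*theta1(x) -> sqrt(pi)/2 as x -> 0, so the full integral equals sqrt(pi)
assert abs(2 * 0.001 * theta1(0.001, K=10000) - math.sqrt(math.pi)) < 2e-3
```

The second check mirrors the theta transformation: near zero, theta1(x) behaves like sqrt(pi)/(2x) - 1/2, so the truncation point K must grow like 1/x.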

The constant associated with the function is

Cf=12f(1)+1({x}1/2)f(x)dxO(1)+121f(x)dx=O(1)+f()f(1)=O(1).\begin{align*} C_f &= \frac{1}{2}f(1) + \int_1^\infty (\{x\} - 1/2) f'(x) dx \\ &\le O(1)+\frac{1}{2} \int_1^\infty |f'(x)| dx \\ &= O(1)+|f(\infty)-f(1)| \\ &= O(1). \end{align*}

Consequently, the average height is indeed πN+O(1)\sqrt{\pi N} + O(1).

Another Combinatorial Proof

Instead of directly handling the error terms in the original approach, it’s instructive to read the paper The Average Height of Catalan Trees by Counting Lattice Paths, which provides a combinatorial proof better suited for computer scientists. It builds upon the material from the book and cites results from the authors of the book.

circle-info

As a side effect, it also achieves the intended goal of Exercise 9 by establishing a bijection between Dyck paths (related to gambler’s ruin paths) and general Catalan trees.

Exercise 6.49

Exercise 6.50

Because of the rotation correspondence, it turns out that the stack height of binary Catalan trees is essentially distributed like the height of forests. The height of a forest is the maximum height of its constituent trees. The book already establishes the relationship between the stack height of binary and general Catalan trees.

Note, the relationship is not a point-wise equality. For example, a left‑chain binary tree with NN nodes has s(t)=0s(t)=0 but the corresponding forest height is N1N-1. Nevertheless, the two quantities are asymptotically equivalent in distribution, as highlighted above.

circle-info

The stack size, as specified, is related to the Strahler number, which measures branching complexity.

Exercise 6.51

We have z=f(u)=u/(1u)z=f(u)=u/(1-u), thus the first case applies

[zn]A(z)=1n[un1](uf(u))n=1n[un1](1u)n=1n(1)n1n=(1)n1(for n1).\begin{align*} [z^n]A(z) &= \frac{1}{n}[u^{n−1}]\left(\frac{u}{f(u)}\right)^n \\ &= \frac{1}{n}[u^{n−1}](1-u)^n \\ &= \frac{1}{n}(-1)^{n-1}n \\ &= (-1)^{n-1} \qquad \text{(for $n \ge 1$)}. \end{align*}

Exercise 6.52

Let f(z)=ez1f(z)=e^z-1, so f(f1(z))=ef1(z)1=zf(f^{-1}(z))=e^{f^{-1}(z)}-1=z. This implies that f1(z)=ln(z+1)f^{-1}(z)=\ln(z+1). Its Taylor expansion is

ln(1+z)=n1(1)n1nzn.\ln(1+z) = \sum_{n\ge1} \frac{(-1)^{n-1}}{n} z^n.

Let’s apply the Lagrange inversion theorem

[zn]ln(1+z)=1n[un1](uf(u))n=1n[un1](ueu1)n.[z^n] \ln(1+z) = \frac{1}{n} [u^{n-1}] \left( \frac{u}{f(u)} \right)^n= \frac{1}{n} [u^{n-1}] \left( \frac{u}{e^u - 1} \right)^n.

Handling the RHS is really messy (see Exercise 3.36), as we need to raise a convoluted power series to the nnth power. Nonetheless, we can put the theorem into a "reverse" regime and get a stunning combinatorial identity

[un1](ueu1)n=(1)n1.[u^{n-1}] \left( \frac{u}{e^u - 1} \right)^n = (-1)^{n-1}.

This is akin to factoring GFs to derive new identities (see Exercise 3.5). Compare also this result with the one from the previous exercise.
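Exact series arithmetic confirms the identity for small n (the helpers series_inv and series_mul are ours):

```python
from fractions import Fraction
from math import factorial

def series_inv(A, N):
    """Coefficients of 1/A(u) mod u^{N+1}, assuming A[0] != 0."""
    B = [Fraction(0)] * (N + 1)
    B[0] = 1 / Fraction(A[0])
    for n in range(1, N + 1):
        B[n] = -B[0] * sum(A[k] * B[n - k] for k in range(1, n + 1))
    return B

def series_mul(A, B, N):
    C = [Fraction(0)] * (N + 1)
    for i, a in enumerate(A[:N + 1]):
        for j, b in enumerate(B[:N + 1 - i]):
            C[i + j] += a * b
    return C

for n in range(1, 10):
    N = n - 1
    # (e^u - 1)/u = sum_{k>=0} u^k/(k+1)!
    A = [Fraction(1, factorial(k + 1)) for k in range(N + 1)]
    P = series_inv(A, N)                 # u/(e^u - 1)
    R = [Fraction(1)] + [Fraction(0)] * N
    for _ in range(n):
        R = series_mul(R, P, N)          # raise to the n-th power
    assert R[N] == (-1) ** (n - 1)
```

Note that u/(e^u - 1) is the EGF of the Bernoulli numbers, which is what makes the clean right-hand side so striking.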

Exercise 6.53

See Exercise 72 for the generalization of this problem.

Exercise 6.54

See Theorem 6.15 in the book with t=4t=4.

Exercise 6.55

This is known as Cayley’s formula.

Exercise 6.56

The book Introduction to Algorithms, Fourth Edition, by CLRS (see Section B.5.1 Free trees) contains a full proof using an extended list (6 instead of 4) of equivalent properties. Furthermore, they apply a clever trick to avoid a combinatorial explosion in the proof: it’s enough to establish a single cycle of implications among the properties (each one implies the next, and the last implies the first), which reduces the required work.

Exercise 6.57

The post on Mathematics StackExchange lists all different free trees with 6 nodes. The distribution of ordered trees onto these, in left-to-right order, is 2, 10, 10, 5, 10, 5. Figure 6.22 shows that the winner is the tree structure known as the Broom graph B5,3B_{5,3}. The same type of graph structure B6,3B_{6,3} also appears most frequently among all ordered trees on six nodes (it’s the second one from the left in the linked post), although not uniquely. Nonetheless, for large NN Broom graphs actually perform poorly (see the next exercise).

At any rate, even for the cases with 5 and 6 nodes, highly symmetric trees perform badly in this respect. For example, the star graph is always associated with only 2 ordered trees.

Exercise 6.58

Imagine a central node connected to

  • a single leaf,

  • a chain of 2 nodes,

  • and a chain of 3 nodes.

This free tree has N=7N=7 nodes in total. It achieves the maximum asymmetry possible for a tree of this size. If we promote the central node to the root, we immediately produce 3! = 6 different ordered trees. If the central node instead becomes a child of some other root, it still contributes 2! variants in any context. A Broom graph is no match for this version.

The key takeaway is to maximize asymmetry within the structure. An effective approach involves beginning with a central node accompanied by subtrees that are structurally distinct, applying this principle recursively throughout each subtree. It’s essential to balance this objective against the number of available nodes. While positioning a high-degree node as the root can generate numerous ordered trees, it is also important to avoid identical subtrees, as these diminish the overall count.

Exercise 6.59

Assume tree denotes an ordered tree, based on the following quote from the book:

But we maintain our algorithmic bias, by reserving the word tree for ordered trees, which arise in perhaps the most natural way in computer applications.

The description below is a linear time algorithm, as a function of the number of edges.

1

Building a Tree Representation

Read an input consisting of a set of edges and produce an adjacency list representation of the tree. Find the root (the node with in-degree 0, assuming edges are directed). Because an adjacency list imposes no intrinsic left-to-right order, the "ordered" nature of the resulting tree depends on the order in which edges are read; the list data structure preserves the order in which items were appended.

2

Traversing the Tree

Do a preorder traversal of the tree as described in Section 6.3 pertaining to parenthesis systems.
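Both steps can be sketched in a few lines of Python (the edge-list input format and helper names are our own assumptions):

```python
from collections import defaultdict

def read_tree(edges):
    """Build an insertion-ordered adjacency list from directed (parent, child)
    edges and locate the root: the unique node that is never a child."""
    children = defaultdict(list)
    nodes, has_parent = set(), set()
    for p, c in edges:
        children[p].append(c)     # a list keeps the left-to-right order
        nodes.update((p, c))
        has_parent.add(c)
    (root,) = nodes - has_parent
    return root, children

def preorder(root, children):
    """Emit the parenthesis string of the ordered tree (Section 6.3 style)."""
    out = []
    def visit(v):
        out.append("(")
        for c in children[v]:
            visit(c)
        out.append(")")
    visit(root)
    return "".join(out)

root, adj = read_tree([(0, 1), (0, 2), (2, 3)])
assert root == 0
assert preorder(root, adj) == "(()(()))"
```

Every edge is touched a constant number of times, so the whole pipeline runs in linear time.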

Exercise 6.60

After executing the algorithm from the previous exercise, we would immediately get the matching binary representation of the tree. Recall from Exercise 8 that a preorder traversal of an ordered tree aligns perfectly with the same traversal of the matching binary tree due to rotation correspondence.

Exercise 6.61

For binary trees there are various streamlined approaches. The LeetCode problem Flip Equivalent Binary Trees (the editorial is publicly available) provides an environment where this algorithm can be implemented and tested. Below is the solution, which illustrates the idea behind the generic AHU-algorithm. It differs from the third approach of LeetCode’s official solution by preserving the original trees.

circle-info

The Python3 code is based on the pseudo-code of the AHU-algorithm, as given in the cited paper. To keep things simple, the label-condensing step is skipped, even though it is needed to retain linear-time performance. Finally, in this LeetCode problem nodes already have distinct values. In general, you would need to assign IDs based on unique subtree structures (see the cited paper).
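A minimal sketch of this canonical-form idea (not the exact LeetCode submission; it relies on the distinct node values the problem guarantees):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def canon(node):
    """Serialize a subtree canonically: the two child serializations are
    sorted, so any sequence of flips yields the same canonical string."""
    if node is None:
        return "#"
    a, b = canon(node.left), canon(node.right)
    if a > b:
        a, b = b, a
    return f"({node.val},{a},{b})"

def flip_equiv(r1, r2):
    # two trees are flip equivalent iff their canonical forms coincide
    return canon(r1) == canon(r2)

t1 = Node(1, Node(2, Node(4)), Node(3))
t2 = Node(1, Node(3), Node(2, None, Node(4)))
assert flip_equiv(t1, t2)
assert not flip_equiv(t1, Node(1, Node(2), Node(3)))
```

Neither input tree is mutated; the canonical strings play the role of the AHU labels before condensation.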

Exercise 6.62

See the previous exercise about the AHU-algorithm. The cited paper there explains the algorithm in detail using pseudo-code and also provides proofs of correctness.

Exercise 6.63

The book suggests building the series step-by-step by iterating the functional equation

U[h+1](z)=zexp{k=1h+11kU[h](zk)}(for h0),U^{[h+1]}(z) = z \exp\left\{ \sum_{k=1}^{h+1} \frac{1}{k} U^{[h]}(z^k) \right\} \qquad \text{(for $h \ge 0$)},

with U[0](z)=zU^{[0]}(z)=z. Because both series agree to h+1h + 1 terms, we can just treat everything as bounded polynomials and update them layer by layer. The next Python program implements the naive version without using any external libraries. Its running time is O(N4)O(N^4) arithmetic operations. It’s intentionally kept in this form to truly depict what is going on under the hood. A simple optimization, elaborated at the beginning of the next exercise, may bring down the running time to O(N3)O(N^3) even for this basic approach.

Python 3 by default handles arbitrarily large integers, so any number of coefficients may be printed. Of course, the poor performance characteristics of this version preclude large NN.
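Since the listing itself isn't part of this text, here is a minimal reconstruction of the naive approach: exact `Fraction` arithmetic, truncated polynomial multiplication, and `exp` computed term by term from its Taylor series (the function names are our own):

```python
from fractions import Fraction

def poly_mul(a, b, n):
    # Truncated product of two coefficient lists, modulo z^n.
    c = [Fraction(0)] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j >= n:
                    break
                c[i + j] += ai * bj
    return c

def poly_exp(p, n):
    # exp(P(z)) mod z^n via the Taylor series sum_j P(z)^j / j!.
    # Requires p[0] == 0, so P^j starts at degree j and the loop terminates.
    result = [Fraction(0)] * n
    result[0] = Fraction(1)
    term = list(result)
    for j in range(1, n):
        term = [t / j for t in poly_mul(term, p, n)]  # now P^j / j!
        for i in range(n):
            result[i] += term[i]
    return result

def rooted_trees(n):
    # Iterate U[h+1](z) = z * exp(sum_k U[h](z^k) / k), starting from
    # U[0](z) = z; n iterations fix all coefficients through degree n.
    size = n + 1
    U = [Fraction(0)] * size
    U[1] = Fraction(1)
    for _ in range(n):
        s = [Fraction(0)] * size  # sum_k U(z^k) / k, truncated
        for k in range(1, size):
            for i in range(1, size):
                if i * k >= size:
                    break
                s[i * k] += U[i] / k
        U = poly_mul([Fraction(0), Fraction(1)], poly_exp(s, size), size)
    # Converged coefficients are exact integers (denominator 1).
    return [f.numerator // f.denominator for f in U]
```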

circle-info

There are many implementations in various programming languages to compute UNU_N, documented in OEIS A000081arrow-up-right. Among them is a one-line SageMath program.

Exercise 6.64

The underlying idea of this optimized version is best understood by looking at a more mundane task of speeding up the poly_exp function (see the previous exercise). It evaluates the Taylor expansion of exe^x by computing each term from scratch. Overall this entails O(N3)O(N^3) operations. Let’s see how to reduce this to O(N2)O(N^2). Let E(z)=exp(P(z))E(z) = \exp(P(z)). Taking the derivative of both sides, we get a differential equation

E(z)=E(z)P(z).E'(z) = E(z) P'(z).

Equating the coefficients of zN1z^{N-1} gives

NEN=k=1NkPkENk.N E_N = \sum_{k=1}^N k P_k E_{N-k}.

Thus, computing each coefficient demands only O(N)O(N) instead of O(N2)O(N^2) arithmetic operations. The Harary-Palmer method differentiates the entire functional equation to get everything down to O(N2)O(N^2). This gives
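The recurrence for the coefficients of E(z) translates directly into code; a sketch with exact `Fraction` arithmetic (the name `poly_exp_fast` is ours):

```python
from fractions import Fraction

def poly_exp_fast(p, n):
    # E = exp(P) mod z^n via N*E_N = sum_{k=1}^{N} k * P_k * E_{N-k};
    # each coefficient costs O(N) operations. Requires p[0] == 0.
    e = [Fraction(0)] * n
    e[0] = Fraction(1)
    for m in range(1, n):
        e[m] = sum(k * p[k] * e[m - k] for k in range(1, m + 1)) / m
    return e
```

With P(z) = z this reproduces the coefficients of e^z, i.e. 1/N!.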

U(z)=zexp{k11kU(zk)}lnU(z)=lnz+k11kU(zk)U(z)U(z)=1z+k11kU(zk)kzk1=1z+k1U(zk)zk1zU(z)=U(z)+U(z)k1zkU(zk)N1NUNzN=N1UNzN+(i1Uizi)(k1l1lUlzkl).\begin{align*} U(z) &= z \exp\left\{ \sum_{k\ge1} \frac{1}{k} U(z^k) \right\} \\ \ln U(z) &= \ln z + \sum_{k\ge1} \frac{1}{k} U(z^k) \\ \frac{U'(z)}{U(z)} &= \frac{1}{z} + \sum_{k\ge1} \frac{1}{k} U'(z^k) \cdot k z^{k-1} \\ &= \frac{1}{z} + \sum_{k\ge1} U'(z^k) z^{k-1} \\ z U'(z) &= U(z) + U(z) \sum_{k\ge1} z^k U'(z^k) \\ \sum_{N\ge1} N U_N z^N &= \sum_{N\ge1} U_N z^N + \left( \sum_{i\ge1} U_i z^i \right) \left( \sum_{k\ge1} \sum_{l\ge1} l U_l z^{kl} \right). \end{align*}

We want to extract the coefficient of zN+1z^{N+1}. On the RHS we must have i+kl=N+1i+kl=N+1. Because i1i \ge 1 this implies that klNkl\le N. Furthermore, l1l \ge 1 means that 1kN1 \le k \le N. We get

(N+1)UN+1=UN+1+klNkUkUN+1klNUN+1=klNkUkUN+1klNUN+1=1kN(kUkkklNUN+1kl)UN+1=1N1kNUN+1k(dkdUd).\begin{align*} (N+1) U_{N+1} &= U_{N+1} + \sum_{kl \le N} k U_k U_{N+1-kl} \\ N U_{N+1} &= \sum_{kl \le N} k U_k U_{N+1-kl} \\ N U_{N+1} &= \boxed{\sum_{1 \le k \le N} \left( k U_k \sum_{k \le kl \le N} U_{N+1-kl} \right)} \\ &\therefore U_{N+1} = \frac{1}{N} \sum_{1 \le k \le N} U_{N+1-k} \left( \sum_{d|k} d U_d \right). \end{align*}

The final equation is the form used in the efficient Python implementation shown below. It’s also a direct application of the Euler transformarrow-up-right (see the third case) to U(z)U(z). The boxed formula is the one we need to show as part of this exercise. Using memoization we can achieve the O(N2)O(N^2) objective. Notice that handling the inner divisor sum requires O(NlogN)O(N \log N) operations during the whole run.
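A sketch of such an implementation, maintaining the divisor sums sk=dkdUds_k = \sum_{d|k} d U_d incrementally instead of memoizing them (the names are ours):

```python
def rooted_trees(n):
    # U_{m+1} = (1/m) * sum_{k=1}^{m} U_{m+1-k} * s[k], where
    # s[k] = sum of d*U_d over divisors d of k. As soon as U_m is known,
    # it is folded into s[j] for every multiple j of m, which costs
    # O(N log N) in total; the convolution dominates with O(N^2).
    U = [0] * (n + 1)
    s = [0] * (n + 1)
    U[1] = 1
    for m in range(1, n):
        for j in range(m, n + 1, m):
            s[j] += m * U[m]
        U[m + 1] = sum(U[m + 1 - k] * s[k] for k in range(1, m + 1)) // m
    return U
```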


Exercise 6.65

See DigraphGenerator.javaarrow-up-right.

Exercise 6.66

Each free tree gives at least one rooted tree (by picking any node as root), so there are at least as many rooted trees of NN nodes as free trees of NN nodes. There are NN choices for picking a root, so there can be at most NN times as many rooted trees of NN nodes as free trees of NN nodes.

Exercise 6.67

The derivation is based on a combinatorial argument expressed in terms of OGFs. The starting point is Otter’s theoremarrow-up-right: "In any tree the number of nonsimilar vertices minus the number of nonsimilar lines (symmetry line excepted) is the number one." Therefore, the formula for any free tree is

1=(distinct node roots)(distinct edge roots except symmetric edges).1 = (\text{distinct node roots}) - (\text{distinct edge roots except symmetric edges}).

If we sum this property over all free trees, the LHS transforms immediately into F(z)F(z). On the RHS, the first term is simply U(z)U(z), by definition of rooted trees.

Removing an undirected edge leaves two rooted trees, one hanging from each endpoint. The OGF for an ordered pair of rooted trees is U(z)2U(z)^2. We don’t care about order, so this must be divided by two; but that undercounts the symmetric case, when both trees are identical, so we must add half of that amount back.

The symmetric case is a pair of identical rooted trees, which produces only even total sizes, hence its OGF is U(z2).U(z^2). Taking into account the previous remarks, we end up with the equation

F(z)=U(z)12U(z)2+12U(z2).F(z) = U(z) - \frac{1}{2}U(z)^2 + \frac{1}{2}U(z^2).

Exercise 6.68

We want to find

FN=[zN](U(z)12U(z)2+12U(z2)).F_N = [z^N] \left( U(z) - \frac{1}{2}U(z)^2 + \frac{1}{2}U(z^2) \right).

We know from the book that [zN]U(z)=UNcαNN3/2[z^N] U(z) = U_N \sim c \alpha^N N^{-3/2}. The symmetric term [zN]U(z2)UN/2[z^N] U(z^2) \sim U_{N/2} is exponentially small relative to UNU_N (see Exercise 4.9). We are left with estimating the discrete convolution

FNUN12k=1N1UkUNk.F_N \approx U_N - \frac{1}{2} \sum_{k=1}^{N-1} U_k U_{N-k}.

Let’s partition the sum into three distinct regions using an intermediate boundary AA (such as A=NA = \sqrt{N}), where 1AN1 \ll A \ll N:

  • The Left Tail: k[1,A]k \in [1, A]

  • The Right Tail: k[NA,N1]k \in [N-A, N-1]

  • The Central Region: k(A,NA)k \in (A, N-A)

Due to symmetry, the left and right tails are the same.

The Left Tail and the Residual

In the left tail, kk is small relative to NN. We use the binomial expansion to approximate the (Nk)3/2(N-k)^{-3/2} term in UNkU_{N-k}

(Nk)3/2=N3/2(1kN)3/2N3/2(1+3k2N+O((k/N)2)).(N-k)^{-3/2} = N^{-3/2}\left(1 - \frac{k}{N}\right)^{-3/2} \approx N^{-3/2}\left(1 + \frac{3k}{2N}+O((k/N)^2)\right).

Substituting this into the left tail of the convolution (ignoring the last O-term)

Left Tailk=1AUk(cαNkN3/2(1+3k2N))=UNk=1AUkαk+3UN2Nk=1AkUkαk.\text{Left Tail} \approx \sum_{k=1}^{A} U_k \left( c \alpha^{N-k} N^{-3/2} \left(1 + \frac{3k}{2N}\right) \right) = U_N \sum_{k=1}^{A} U_k \alpha^{-k} + \frac{3 U_N}{2N} \sum_{k=1}^{A} k U_k \alpha^{-k}.

To evaluate the first sum, we use the technique of bounding the tail from Section 4.4 of the book. For this, we must peek a little bit into singularity analysis in the book Analytic Combinatorics by PF & RS. The following lemma is important:

Lemma: Let ρ=1/α\rho = 1/\alpha be the radius of convergence of U(z)U(z). Then, U(ρ)=1U(\rho) = 1.

Since the full sum equals U(ρ)=1U(\rho)=1 and the tail beyond AA can be approximated by an integral, this gives

k=1AUkαk=1Acx3/2dx=12cA1/2.\sum_{k=1}^{A} U_k \alpha^{-k} = 1 - \int_{A}^{\infty} c x^{-3/2} \, dx = 1 - 2c A^{-1/2}.

Thus, the left tail evaluates to

Left TailUN2cUNA+3cUN2N(2A+Cf).\text{Left Tail} \approx U_N - \frac{2c U_N}{\sqrt{A}} + \frac{3 cU_N}{2N} \left( 2\sqrt{A} + C_f \right).

The last summand is the result of evaluating the second sum in the left tail by employing the Euler-Maclaurin method. CfC_f is the usual constant in this approach.

The Central Region

In the central region, both kk and NkN-k are large, allowing us to safely replace both with their continuous asymptotic equivalents and approximate the sum as an integral

CenterANA(cαxx3/2)(cαNx(Nx)3/2)dx=c2αNANAx3/2(Nx)3/2dx=c2αN4(2(NA)N)N2(NA)A(F(A)=F(NA))=c2αN4(N2A)N2ANA=c2αN4N(12A/N)N2AN1A/N=c2αN4(12A/N)N3/2A(1AN)1/2c2αN4N3/2A(12AN)(1+A2N)((1A/N)1/21+A2N)c2αN4N3/2A(13A2N)(dropping quadratic term)=4cUNA6cUNAN.\begin{align*} \text{Center} &\approx \int_{A}^{N-A} \left( c \alpha^x x^{-3/2} \right) \left( c \alpha^{N-x} (N-x)^{-3/2} \right) \, dx \\ &= c^2 \alpha^N \int_{A}^{N-A} x^{-3/2}(N-x)^{-3/2} \, dx \\ &= c^2 \alpha^N \frac{4(2(N-A)-N)}{N^2 \sqrt{(N-A)A}} && \text{($F(A)=-F(N-A)$)}\\ &= c^2 \alpha^N \frac{4(N-2A)}{N^2 \sqrt{A} \sqrt{N-A}} \\ &= c^2 \alpha^N \frac{4N(1 - 2A/N)}{N^2 \sqrt{A} \sqrt{N} \sqrt{1 - A/N}} \\ &= c^2 \alpha^N \frac{4(1 - 2A/N)}{N^{3/2} \sqrt{A}} \left(1 - \frac{A}{N}\right)^{-1/2} \\ &\approx c^2 \alpha^N \frac{4}{N^{3/2} \sqrt{A}} \left(1 - \frac{2A}{N}\right) \left(1 + \frac{A}{2N}\right) && \text{($(1 - A/N)^{-1/2} \approx 1 + \frac{A}{2N}$)} \\ &\approx c^2 \alpha^N \frac{4}{N^{3/2} \sqrt{A}} \left(1 - \frac{3A}{2N}\right) && \text{(dropping quadratic term)} \\ &= \frac{4 c U_N}{\sqrt{A}} - \frac{6cU_N\sqrt{A}}{N}. \end{align*}

Combined Result of the Convolution

We now reassemble the total convolution by summing the three regions

UkUNk=Left Tail+Right Tail+Center=2(UN2cUNA+3cUN2N(2A+Cf))+(4cUNA6cUNAN)=2UN+O(UN/N).\begin{align*} \sum U_k U_{N-k} &= \text{Left Tail} + \text{Right Tail} + \text{Center} \\ &= 2\left( U_N - \frac{2c U_N}{\sqrt{A}} + \frac{3 cU_N}{2N} \left( 2\sqrt{A} + C_f \right)\right) + \left( \frac{4c U_N}{\sqrt{A}} - \frac{6cU_N\sqrt{A}}{N} \right) \\ &= 2U_N+O(U_N/N). \end{align*}

Conclusion

Substituting back the result of the convolution, we have

FNUN12(2UN+O(UN/N))c1αNN5/2.F_N \approx U_N - \frac{1}{2} ( 2 U_N + O(U_N/N)) \sim c_1\alpha^NN^{-5/2}.

We’ve applied the method of matched asymptotic expansionsarrow-up-right, which ensures that all artificial boundaries vanish from the final expression.

Exercise 6.69

The number of distinct labelings of an unlabeled tree TT with NN nodes is N!/Aut(T)N! / |\text{Aut}(T)|, where Aut(T)\text{Aut}(T) is the automorphism group of TT. The automorphism group Aut(T)\text{Aut}(T) of a tree TT is the set of all permutations of the nodes of TT that preserve the adjacency relation. In other words, a permutation σ\sigma of the node set is an automorphism if for any two nodes u,vu, v, the pair {u,v}\{u, v\} is an edge in TT if and only if {σ(u),σ(v)}\{\sigma(u), \sigma(v)\} is an edge in TT. To maximize the number of labelings, we minimize the size of the automorphism group.

For N=4,5,6N=4,5,6, the path (automorphism group of order 2) has the most labelings (it is the sole winner only when N=4N=4). For N7N \ge 7, we can always create an asymmetric tree with a trivial automorphism group, producing N!N! labelings.

A concrete example with N=7N=7 nodes is

Notice that every branch configuration is distinct. One possible general structure is a central node with branches of different lengths.

Exercise 6.70

Wikipedia has a detailed solutionarrow-up-right as part of the Cayley’s formula.

Exercise 6.71

The equation follows from the arguments given in Exercise 67, with the symmetric case omitted, as it doesn’t apply here.

Exercise 6.72

The OGF for kk-forests is just T[t](z)kT^{[t]}(z)^k, where T[t](z)T^{[t]}(z) is the OGF for tt-ary trees. We know from the book that z=u(1ut1)z=u(1-u^{t-1}). By Lagrange inversion (using the second case in Theorem 6.11), the number of kk-forests of tt-ary trees with a total of NN internal nodes and (t1)N+k(t-1)N+k external nodes (each tree also contributes +1 to the total) is

[z(t1)N+k]T[t](z)k=k(t1)N+k[u(t1)N](1ut1)((t1)N+k)=k(t1)N+k(tN+k1N)=ktN+k(tN+kN).\begin{align*} [z^{(t-1)N+k}]T^{[t]}(z)^k &= \frac{k}{(t-1)N+k}[u^{(t-1)N}](1-u^{t-1})^{-((t-1)N+k)} \\ &= \frac{k}{(t-1)N+k} \binom{tN+k-1}{N} \\ &= \frac{k}{tN+k} \binom{tN+k}{N}. \end{align*}

Exercise 6.73

We start with the exact formula from the book

EN=1(t1)N+1(tNN)=1(t1)N+1(tN)!N!((t1)N)!.E_N = \frac{1}{(t-1)N+1} \binom{tN}{N} = \frac{1}{(t-1)N+1} \frac{(tN)!}{N! ((t-1)N)!}.

For large NN, we can approximate the prefactor as

1(t1)N+11(t1)N.\frac{1}{(t-1)N+1} \sim \frac{1}{(t-1)N}.

Now, we apply Stirling's approximation to the three factorials in the binomial coefficient

(tNN)2πtNttNNtNetN2πNNNeN2π(t1)N(t1)(t1)NN(t1)Ne(t1)N=2πtN2πN2π(t1)NttN(t1)(t1)N=2πtN2πN2π(t1)N(tt(t1)t1)N=12πNtt1(αt)N.\begin{align*} \binom{tN}{N} &\sim \frac{\sqrt{2\pi t N} \cdot t^{tN} \cdot N^{tN} \cdot e^{-tN}}{\sqrt{2\pi N} \cdot N^N \cdot e^{-N} \cdot \sqrt{2\pi (t-1)N} \cdot (t-1)^{(t-1)N} \cdot N^{(t-1)N} \cdot e^{-(t-1)N}} \\ &= \frac{\sqrt{2\pi t N}}{\sqrt{2\pi N} \sqrt{2\pi (t-1)N}} \cdot \frac{t^{tN}}{(t-1)^{(t-1)N}} \\ &= \frac{\sqrt{2\pi t N}}{\sqrt{2\pi N} \sqrt{2\pi (t-1)N}} \cdot \left( \frac{t^t}{(t-1)^{t-1}} \right)^N \\ &= \frac{1}{\sqrt{2\pi N}} \sqrt{\frac{t}{t-1}} (\alpha_t)^N. \end{align*}

This gives

EN(1(t1)N)(12πNtt1(αt)N)=1N3/21(t1)2πtt1(αt)N=12π(t1)3t(αt)NN3/2=ct(αt)NN3/2.\begin{align*} E_N &\sim \left( \frac{1}{(t-1)N} \right) \cdot \left( \frac{1}{\sqrt{2\pi N}} \sqrt{\frac{t}{t-1}} (\alpha_t)^N \right) \\ &= \frac{1}{N^{3/2}} \cdot \frac{1}{(t-1)\sqrt{2\pi}} \sqrt{\frac{t}{t-1}} \cdot (\alpha_t)^N \\ &= \frac{1}{\sqrt{2\pi \frac{(t-1)^3}{t}}} \cdot \frac{(\alpha_t)^N}{N^{3/2}} \\ &= c_t \frac{(\alpha_t)^N}{N^{3/2}}. \end{align*}

Exercise 6.74

We start with the formula for the number of tt-restricted trees from the book

[zN]T(z)=1N[uN1](θ(u))N=1N[uN1](1ut+1)N(1u)N=1N[uN1](k=0N(Nk)(1)kuk(t+1))(j0(N+j1N1)uj).\begin{align*} [z^N]T(z) &= \frac{1}{N}[u^{N-1}](\theta(u))^N \\ &= \frac{1}{N}[u^{N-1}]\left(1-u^{t+1}\right)^N (1-u)^{-N} \\ &= \frac{1}{N}[u^{N-1}]\left(\sum_{k=0}^{N} \binom{N}{k} (-1)^k u^{k(t+1)}\right)\left(\sum_{j\ge0} \binom{N+j-1}{N-1} u^j\right). \end{align*}

We want to extract the coefficient [uN1][u^{N-1}], so only terms satisfying the following equations are relevant

k(t+1)+j=N1    j=N1k(t+1),j0    kN1t+1.k(t+1) + j = N - 1 \implies j = N - 1 - k(t+1), \\ j \ge 0 \implies k \le \frac{N-1}{t+1}.

This gives

[zN]T(z)=1Nk=0N1t+1(1)k(Nk)(2N2k(t+1)N1).[z^N]T(z) = \frac{1}{N} \sum_{k=0}^{\lfloor \frac{N-1}{t+1} \rfloor} (-1)^k \binom{N}{k} \binom{2N - 2 - k(t+1)}{N-1}.

Exercise 6.75

The Python 3 program below implements the formula derived in the previous exercise.
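The original program isn’t included in this text; a minimal version of the formula might look like this (the function name is ours; for t=2t=2 it reproduces the Motzkin numbers):

```python
from math import comb

def restricted_trees(N, t):
    # Ordered trees with N nodes where every node has at most t children,
    # via the alternating-sum formula of the previous exercise.
    total = sum((-1)**k * comb(N, k) * comb(2*N - 2 - k*(t + 1), N - 1)
                for k in range((N - 1) // (t + 1) + 1))
    return total // N  # the division is exact
```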

circle-info

The default configuration prints the Motzkin numbers on the output. They match the OEIS A001006arrow-up-right sequence.

Exercise 6.76

The customized Theorem 6.16 uses the next θ\theta function

θeven(u)=1+u2+u4++u2r=1u2(r+1)1u2.\theta_{\text{even}}(u)=1+u^2+u^4+\dots+u^{2r}= \dfrac{1-u^{2(r+1)}}{1-u^2}.

The RHS follows by reusing the idea from Exercise 74. Because the sum of children counts (i.e. the total number of edges) is even, the total number of nodes N=edges+1=2k+1N = \text{edges}+1=2k+1 must be odd. The Lagrange inversion formula gives

[zN]Teven(z)=12k+1[u2k](θeven(u))2k+1.[z^N]T_{even}(z)=\frac{1}{2k+1}\,[u^{2k}] \bigl(\theta_{\text{even}}(u)\bigr)^{2k+1}.

Let’s use the substitution v=u2v=u^2, so

(θeven(u))2k+1=(1v)(2k+1)(1vr+1)2k+1=i=02k+1(1)i(2k+1i)vi(r+1)j0(2k+jj)vj.\begin{align*} \bigl(\theta_{\text{even}}(u)\bigr)^{2k+1} &= (1-v)^{-(2k+1)} (1-v^{r+1})^{2k+1} \\ &= \sum_{i=0}^{2k+1} (-1)^i \binom{2k+1}{i} v^{i(r+1)} \sum_{j\ge 0} \binom{2k+j}{j} v^j. \end{align*}

We want to extract the coefficient [u2k]=[vk][u^{2k}]=[v^k], so only terms satisfying the following equations are relevant

i(r+1)+j=k    j=ki(r+1),j0    ikr+1.i(r+1) +j=k \implies j =k- i(r+1), \\ j \ge 0 \implies i \le \frac{k}{r+1}.

This gives

[zN]Teven(z)=12k+1i=0k/(r+1)(1)i(2k+1i)(3ki(r+1)2k)=12k+1i=0k/(t+1)/2(1)i(2k+1i)(3ki(t+1)/22k).\begin{align*} [z^N]T_{even}(z) &= \frac{1}{2k+1} \sum_{i=0}^{\lfloor k/(r+1)\rfloor} (-1)^i \binom{2k+1}{i} \binom{3k - i(r+1)}{2k} \\ &= \boxed{\frac{1}{2k+1} \sum_{i=0}^{\lfloor k / \lfloor (t+1)/2 \rfloor \rfloor} (-1)^i \binom{2k+1}{i} \binom{3k - i\lfloor (t+1)/2 \rfloor}{2k}}. \end{align*}

Exercise 6.77

Balanced trees (AVL and 2-3) definitely have a smaller cardinality than unbalanced trees (3-ary and 3-restricted).

AVL has only a local height constraint, whilst 2-3 has a global condition (all external nodes must be at the same height). Nonetheless, AVL trees are binary, which turns out to be the more limiting factor.

The 3-ary trees have the most possibilities, since "empty" links also contribute to the combinatorial count.

Therefore, the list of trees in increasing order of cardinality for large NN is: AVL, 2-3, 3-restricted, and 3-ary.

Exercise 6.78

Assume NN denotes the number of internal nodes.

Both trees show oscillations, especially the 2-3 variant, and some sizes aren’t even achievable. This shows how global constraints (2-3 depth) versus local constraints (AVL balance) completely rewrite the asymptotic behavior of the generating functions.

Exercise 6.79

We can apply the symbolic method in both cases. Notice the usage of the Iverson bracketarrow-up-right operator.

An AVL tree of height hh is either empty or has the root attached to perfectly balanced subtrees of height h1h-1 or attached to subtrees with heights h1h-1 and h2h-2 (either left or right child is taller).

Ah=[h0]ϵ+[h>0]Z×(2Ah1×Ah2+Ah12).A_h=[h \le 0]\epsilon+[h > 0]Z \times (2A_{h-1} \times A_{h-2} + A_{h-1}^2).

To specify the functional equation on the generating function of 2-3 trees, we must use the substitution operatorarrow-up-right of the symbolic method. Based on the definition from the book, each internal node is either a 2 node (doubles the number of external nodes) or 3 node (triples the number of external nodes). Furthermore, all external nodes must be at the same level. Consequently, the total number of external nodes equals the product of multiplicative factors of internal nodes.

Bh=[h=0]Z+[h>0]Bh1(Z2+Z3).B_h=[h=0]Z+[h>0]B_{h-1} \circ (Z^2+Z^3).

Notice that substituting the atoms of Bh1B_{h-1} with Z2Z^2 or Z3Z^3 exactly incorporates those possible multiplicative factors of the internal nodes.
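The substitution can be carried out numerically at the coefficient level; a sketch representing Bh(z)B_h(z) as a dict mapping a number of external nodes to a count (the names are ours):

```python
from math import comb

def two_three(h):
    # Coefficients of B_h(z): B_0 = z, B_h = B_{h-1} composed with z^2 + z^3.
    # Maps "number of external nodes -> count" for 2-3 trees of height h.
    B = {1: 1}
    for _ in range(h):
        nxt = {}
        for n, c in B.items():
            # Each of the n atoms becomes z^2 or z^3: expand (z^2 + z^3)^n,
            # choosing i of the n atoms to become z^3.
            for i in range(n + 1):
                nxt[2*n + i] = nxt.get(2*n + i, 0) + c * comb(n, i)
        B = nxt
    return B
```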

Last updated