> For the complete documentation index, see [llms.txt](https://evarga.gitbook.io/sh-aofa/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://evarga.gitbook.io/sh-aofa/algorithms-and-combinatorial-structures/chapter-9-words-and-mappings.md).

# Chapter 9: Words and Mappings

## Exercise 9.1

Imagine placing $$N$$ labeled balls and $$M-1$$ distinct sticks in a row. The total number of arrangements is $$(N+M-1)!$$. Sticks are markers for urns. For example, `|21||3` denotes a configuration where urns 0 and 2 are empty, urn 1 contains balls 2 and 1 (in this specific order) and urn 3 contains 3. Now, in our case both the balls and the sticks are indistinguishable. Therefore,

$$
C\_{MN}=\frac{(N+M-1)!}{N!(M-1)!}=\binom{N+M-1}{M-1}.
$$

{% hint style="info" %}
The general framework is known as the [stars and bars](https://en.wikipedia.org/wiki/Stars_and_bars_\(combinatorics\)) method.
{% endhint %}

## Exercise 9.2

{% code title="Occupancy distribution and properties of 4-words of length 3" expandable="true" %}

```
word  0 1 2 3    word  0 1 2 3    word  0 1 2 3    word  0 1 2 3
111   3 0 0 1    211   2 1 1 0    311   2 1 1 0    411   2 1 1 0
112   2 1 1 0    212   2 1 1 0    312   1 3 0 0    412   1 3 0 0
113   2 1 1 0    213   1 3 0 0    313   2 1 1 0    413   1 3 0 0
114   2 1 1 0    214   1 3 0 0    314   1 3 0 0    414   2 1 1 0
121   2 1 1 0    221   2 1 1 0    321   1 3 0 0    421   1 3 0 0
122   2 1 1 0    222   3 0 0 1    322   2 1 1 0    422   2 1 1 0
123   1 3 0 0    223   2 1 1 0    323   2 1 1 0    423   1 3 0 0
124   1 3 0 0    224   2 1 1 0    324   1 3 0 0    424   2 1 1 0
131   2 1 1 0    231   1 3 0 0    331   2 1 1 0    431   1 3 0 0
132   1 3 0 0    232   2 1 1 0    332   2 1 1 0    432   1 3 0 0
133   2 1 1 0    233   2 1 1 0    333   3 0 0 1    433   2 1 1 0
134   1 3 0 0    234   1 3 0 0    334   2 1 1 0    434   2 1 1 0
141   2 1 1 0    241   1 3 0 0    341   1 3 0 0    441   2 1 1 0
142   1 3 0 0    242   2 1 1 0    342   1 3 0 0    442   2 1 1 0
143   1 3 0 0    243   1 3 0 0    343   2 1 1 0    443   2 1 1 0
144   2 1 1 0    244   2 1 1 0    344   2 1 1 0    444   3 0 0 1
total 108 108 36 4

Occupancy distributions
         4 x 3001 + 36 x 2110 + 24 x 1300

           Pr{no urn empty}                                              = 0
    Pr{urn occupancies all < 2}                                    24/64 ≈ 0.375
    Pr{urn occupancies all < 3}                             (36 + 24)/64 ≈ 0.938
     Avg. number empty urns                       (3*4 + 2*36 + 1*24)/64 ≈ 1.688
     Avg. maximum occupancy                       (3*4 + 2*36 + 1*24)/64 ≈ 1.688
             Avg. occupancy                (1*108 + 2*36 + 3*4)/(4 * 64) ≈ 0.750
     Avg. minimum occupancy                                              = 0
```

{% endcode %}

## Exercise 9.3

{% code title="Occupancy parameters for balls in two urns" expandable="true" %}

```
                        balls ->    1      2      3      4      5      6      7      8      9     10
Probability
      occupancies all < 2           1   .500      0      0      0      0      0      0      0      0
      occupancies all < 3           1      1   .750   .375      0      0      0      0      0      0
      occupancies all < 4           1      1      1   .875   .625   .313      0      0      0      0
      occupancies all > 0           0   .500   .750   .875   .938   .969   .984   .992   .996   .998
      occupancies all > 1           0      0      0   .375   .625   .781   .875   .930   .961   .979
      occupancies all > 2           0      0      0      0      0   .313   .547   .711   .820   .891

Average
             # empty urns           1   .500   .250   .125   .063   .031   .016   .008   .004   .002
           max. occupancy           1   1.50   2.25   2.75   3.44   3.94   4.59   5.09   5.73   6.23
           min. occupancy           0   .500   .750   1.25   1.56   2.06   2.41   2.91   3.27   3.77
```

{% endcode %}

## Exercise 9.4

In the previous exercise, we’ve already seen an instance of this, when the number of balls and urns equals $$N=M=2$$. It turns out, that it’s also the necessary and sufficient condition (assuming $$N,M\ge 1$$) for the average number of empty urns to equal the average minimum urn occupancy.

Let $$X$$ be the average number of empty urns and $$Y$$ be the average minimum urn occupancy. Since the probability of any single urn being empty is $$(1 - \frac{1}{M})^N$$, by linearity of expectation, the expected number of empty urns is

$$
X = M \left(1 - \frac{1}{M}\right)^N.
$$

The average minimum occupancy can be computed using the tail sum formula of expectation

$$
Y = \sum\_{k=1}^{\lfloor N/M \rfloor} \Pr{\min \ge k}.
$$

We are looking for conditions where $$X=Y$$.

If there are fewer balls than urns ($$N\<M$$), at least one urn must be empty. Consequently, $$X>0$$ and $$Y=0$$, which means this situation cannot result in these quantities to be equal.

The case $$N = M \implies \lfloor N/M \rfloor=1$$, so $$Y$$ simplifies to just the probability that every urn has exactly 1 ball; it’s the number of favorable permutations over the total configurations $$Y = \frac{M!}{M^M}$$. This gives

$$
X = Y \implies M \left(1 - \frac{1}{M}\right)^M = \frac{M!}{M^M} \iff (M - 1)^M = (M - 1)! \implies M=2 \quad \checkmark
$$

For any $$M \ge 3$$, the LHS definitely exceeds the RHS.

Finally, we must check the case $$N>M$$. Here, the expected number of empty urns decays exponentially toward 0, while the expected minimum occupancy grows toward roughly $$N/M$$. Because their trajectories are fundamentally different in the limit, they only intersect once. It can be shown that no integer solution exists for this intersection point, although the proof itself is out of scope of this manual.&#x20;

## Exercise 9.5

According to the estimate, it needs $$N \sim \sqrt {2 \times365\times\ln 100} \approx 58$$ people, which slightly overestimates the true answer $$N=57$$ (acquired using a quick computer calculation).

## Exercise 9.6 🌟

{% hint style="success" %}
This exercise connects the expectation of the Ramanujan Q-function with the Rayleigh distribution. Furthermore, it presents a generalized formula to compute expectation of a random variable raised to $$n$$th power.
{% endhint %}

The variance is defined as $$Var(X) = \mathbb{E}\[X^2] - (\mathbb{E}\[X])^2$$. We already have the asymptotic expected value (the mean) from Theorem 9.1

$$
\mathbb{E}\[X] \sim \sqrt{\frac{\pi M}{2}} \implies ( \mathbb{E}\[X])^2 \sim \frac{\pi M}{2}.
$$

The book *A First Course in Probability* by Sheldon Ross provides a handy formula to compute $$\mathbb{E}\[X^2]$$. Its generalized form is

$$
\mathbb{E}\[X^n] = \int\_0^\infty n x^{n-1} P(X > x) dx.
$$

Based on Theorem 4.4 (Ramanujan Q-distribution), the probability of having no collisions after $$x$$ throws can be estimated as

$$
\Pr{X > x} \sim e^{\frac{-x^2}{2M}}.
$$

This gives

$$
\mathbb{E}\[X^2] \sim \int\_{0}^{\infty} 2x e^{\frac{-x^2}{2M}} dx=\int\_{0}^{-\infty} -2M e^u du =  2M.
$$

After plugging back the pieces into the variance formula we get

$$
\boxed{Var(X) = M \left(2 - \frac{\pi}{2}\right)}.
$$

The standard deviation is

$$
\sigma = \sqrt{Var(X)} = \sqrt{M\left(2 - \frac{\pi}{2}\right)} \approx 0.655 \sqrt{M}.
$$

Apparently, it scales at the exact same rate as the mean and the median. The limit distribution of $$X/\sqrt{M}$$ is the [Rayleigh distribution](https://en.wikipedia.org/wiki/Rayleigh_distribution), which is **right‑skewed**. In any such distribution the long right tail pulls the mean to the right of the median. Because the shape of the distribution does not change as $$M$$ grows—it simply stretches—the proportional difference between the mean and median remains constant. This is a fundamental property of the Rayleigh limit.

## Exercise 9.7

{% code title="List of 2-collections and 3-collections from Table 9.2" expandable="true" %}

```
2-collections             3-collections
1112                      1123
1113                      1132
2221                      1213
2223                      1223
3331                      1312
3332                      1332
                          2113
                          2123
                          2213
                          2231
                          2321
                          2331
                          3112
                          3132
                          3221
                          3231
                          3321
                          3312 
```

{% endcode %}

$$
P\_{32}(z)=\frac{6z^2}{1-z} \implies \[z^N]P\_{32}(z)=6 \quad \text{ for $N\ge2$}, \\
P\_{33}(z)=\frac{6z^3}{(1-z)(1-2z)} \implies \[z^N]P\_{33}(z)=6(2^{N-2}-1) \quad \text{ for $N\ge3$}.
$$

The coefficients of $$z^4$$ give back the counts from the above list (6 and 18).

## Exercise 9.8 🌟

{% hint style="success" %}
This exercise gives a detailed application of partial fraction decomposition using the [limit method](https://en.wikipedia.org/wiki/Partial_fraction_decomposition#Example_5_\(limit_method\)).
{% endhint %}

We want to find

$$
\[z^N] P\_{MM}(z/M) = \[z^N] \left( z^M \frac{(M-1)!}{\prod\_{k=1}^{M-1} (M - kz)} \right) = \[z^{N-M}] \underbrace{\frac{(M-1)!}{\prod\_{k=1}^{M-1} (M - kz)}}\_{R(z)}.
$$

The partial fraction decomposition gives the following structural form

$$
R(z) = \sum\_{k=1}^{M-1} \frac{B\_k}{M - kz}.
$$

Thankfully to the limit method, each $$B\_k$$ can be handled in a consistent and uniform manner

$$
B\_k = \lim\_{z \to M/k} (M - kz) R(z) = \left. \frac{(M-1)!}{\prod\_{i=1, i \neq k}^{M-1} (M - iz)} \right|\_{z = \frac{M}{k}}.
$$

Plugging in $$z = M/k$$ into the denominator, we get

$$
\prod\_{i \neq k} \left(M - i\frac{M}{k}\right) = \left(\frac{M}{k}\right)^{M-2} \prod\_{i=1, i \neq k}^{M-1} (k - i).
$$

Split the remaining product into positive and negative factorials

$$
\prod\_{i=1, i \neq k}^{M-1} (k - i) = (k-1)! (-1)^{M-1-k} (M-1-k)!.
$$

We now assemble $$B\_k$$

$$
\begin{align\*}
B\_k &= \frac{(M-1)!}{\left(\frac{M}{k}\right)^{M-2} (k-1)! (M-1-k)! (-1)^{M-1-k}} \\
&= k \binom{M-1}{k} \left(\frac{k}{M}\right)^{M-2} (-1)^{M-1-k} \\
&= M \binom{M-1}{k} \left(\frac{k}{M}\right)^{M-1} (-1)^{M-1-k}.
\end{align\*}
$$

Substitute $$B\_k$$ back into the sum for $$R(z)$$. This gives

$$
\begin{align\*}
\[z^{N-M}] R(z) &= \sum\_{k=1}^{M-1} \[z^{N-M}] \frac{B\_k}{M - kz} \\
&= \sum\_{k=1}^{M-1} \[z^{N-M}] \frac{B\_k / M}{1 - \frac{k}{M}z} \\
&= \sum\_{k=1}^{M-1} \frac{B\_k}{M} \left(\frac{k}{M}\right)^{N-M} \\
&= \sum\_{k=1}^{M-1} \binom{M-1}{k} \left(\frac{k}{M}\right)^{M-1} (-1)^{M-1-k} \left(\frac{k}{M}\right)^{N-M} \\
&= \sum\_{k=1}^{M-1} \binom{M-1}{k} (-1)^{M-1-k} \left(\frac{k}{M}\right)^{N-1} \\
&= \sum\_{j=0}^{M-2} \binom{M-1}{j} (-1)^j \left(1 - \frac{j+1}{M}\right)^{N-1} && \text{($j = M - 1 - k$)}\\
&=  \sum\_{0 \le j < M} \binom{M-1}{j} (-1)^j \left(1 - \frac{j+1}{M}\right)^{N-1}.
\end{align\*}
$$

## Exercise 9.9 🌟

{% hint style="success" %}
&#x20;This exercise demonstrates the [inclusion-exclusion principle](https://en.wikipedia.org/wiki/Inclusion%E2%80%93exclusion_principle) in action.
{% endhint %}

Let $$A\_i$$ be the event that a specific coupon $$i$$ is missing from our collection after opening $$N-1$$ boxes. Therefore,

$$
\Pr{X \ge N} = \Pr{A\_1 \cup A\_2 \cup \dots \cup A\_M}.
$$

By the inclusion-exclusion principle, we’ve

$$
\Pr{\text{Union}} = \sum \Pr{A\_i} - \sum \Pr{A\_i \cap A\_j} + \sum \Pr{A\_i \cap A\_j \cap A\_k} - \dots.
$$

The probability of missing a specific set of $$j$$ coupons is $$\left(1 - \frac{j}{M}\right)^{N-1}$$. There are $$\binom{M}{j}$$ such combinations of size $$j$$, and the sign alternates based on $$(-1)^{j-1}$$. After assembling the final formula, we get an equation very similar to the one from the previous exercise

$$
\Pr{X \ge N} = \sum\_{j=1}^{M} (-1)^{j-1} \binom{M}{j} \left(1 - \frac{j}{M}\right)^{N-1}.
$$

## Exercise 9.10

{% hint style="warning" %}
The text of this exercise is vague, assume that we need to find the EGF for M-surjections instead of surjections.
{% endhint %}

Let $$F\_M(z)=\sum\_{N\ge 0}f\_{NM}z^N/N!$$ be our EGF for $$M$$-surjections, where $$f\_{NM}$$ denotes the number of $$M$$-surjections of length $$N$$. The largest urn $$M$$ may contain $$1 \le k \le N-M+1$$ balls. For any fixed $$k$$, the number of ways to form $$M$$-surjections of length $$N$$ is

$$
\binom{N}{k}(M-1)!\left{ \begin{matrix} N-k \ M-1 \end{matrix} \right}.
$$

Now, we just need to sum over $$k$$

$$
f\_{NM}=\sum\_k \binom{N}{k}(M-1)!\left{ \begin{matrix} N-k \ M-1 \end{matrix} \right}=\sum\_k \binom{N}{k}f\_{(N-k)(M-1)}.
$$

The RHS is a binomial sum operation, so

$$
F\_M(z)=(e^z-1)F\_{M-1}(z),
$$

with the initial condition of $$F\_0(z)=1$$. The multiplicative factor of $$e^z-1$$ is due to $$k$$ being a positive integer (starts at 1). Iterating the equation, we get $$F\_M(z)=(e^z-1)^M$$, as expected.

## Exercise 9.11

The next exercise shows that

$$
N!\[z^N]F\_M(z)=\sum\_{j=0}^M  \binom{M}{j}j^N(-1)^{M-j} .
$$

The Python script below uses this formula to decide whether to print out the M-surjection or not. Another possibility would be to temporarily store words as they’re being generated and abort the process if the list becomes larger than the threshold. Otherwise, print the content of the list at the end.&#x20;

{% code expandable="true" %}

```python
import math
import itertools

def f_mn(n, m):
    if m == 0 and n == 0:
        return 1
    if m == 0 or n == 0 or m > n:
        return 0

    total = 0
    for j in range(m + 1):
        sign = -1 if (m - j) % 2 == 1 else 1
        total += sign * math.comb(m, j) * (j ** n)
    return total

def print_m_surjections(n, m, limit=1000):
    count = f_mn(n, m)

    print(f"--- {m}-surjections of length {n} ---")
    print(f"Total count: {count}")

    if count == 0:
        print("No valid surjections possible (N must be >= M > 0).")
        return

    if count >= limit:
        print(f"The number of objects ({count}) meets or exceeds the limit of {limit}.")
        print("Skipping explicit generation.")
        return

    alphabet = [str(i) for i in range(1, m + 1)]
    valid_surjections = []
    for word_tuple in itertools.product(alphabet, repeat=n):
        if len(set(word_tuple)) == m:
            valid_surjections.append("".join(word_tuple))

    for idx, surjection in enumerate(valid_surjections):
        print(f"{idx + 1:>3}. {valid_surjections[idx]}")

# Example 1: 3-surjections of length 4 (Should be 36 objects)
print_m_surjections(n=4, m=3)
# Example 2: 4-surjections of length 4 (Should be 24 objects - i.e., 4!)
print_m_surjections(n=4, m=4)
# Example 3: Exceeds the limit of 1000
print_m_surjections(n=8, m=6)
```

{% endcode %}

## Exercise 9.12

Based on [Exercise 9.8](#exercise-9.8) we’ve

$$
\sum\_{j=0}^{M-1} \binom{M-1}{j} (-1)^j \left(1 - \frac{j+1}{M}\right)^{N-1} = \frac{M!}{M^N} \begin{Bmatrix} N-1 \ M-1 \end{Bmatrix}.
$$

Get a common denominator inside the power: $$\left(\frac{M - j - 1}{M}\right)^{N-1}$$. Now, substitute the index $$k = M - 1 - j$$ to get

$$
\sum\_{k=0}^{M-1} \binom{M-1}{k} (-1)^{M-1-k} \frac{k^{N-1}}{M^{N-1}} = \frac{M \cdot (M-1)!}{M \cdot M^{N-1}} \begin{Bmatrix} N-1 \ M-1 \end{Bmatrix},
$$

which simplifies to

$$
\sum\_{k=0}^{M-1} \binom{M-1}{k} (-1)^{M-1-k} k^{N-1} = (M-1)! \begin{Bmatrix} N-1 \ M-1 \end{Bmatrix}.
$$

{% hint style="info" %}
Because $$N-1$$ is just a placeholder for the number of items being partitioned, this identity holds for any generic exponent $$i$$.
{% endhint %}

We begin with the LHS in our exercise (recall that $$N!\[z^N]e^{jz}=j^N$$ from Table 3.4)

$$
\begin{align\*}
\sum\_{j=0}^{M} \binom{M}{j} (-1)^{M-j} j^N &= \sum\_{j=1}^{M} \binom{M}{j} (-1)^{M-j} j \cdot j^{N-1} \\\[0.4cm]
&= M \sum\_{j=1}^{M} \binom{M-1}{j-1} (-1)^{M-j} j^{N-1} && \text{$\left( j \binom{M}{j} = M \binom{M-1}{j-1} \right)$} \\\[0.4cm]
&= M \sum\_{k=0}^{M-1} \binom{M-1}{k} (-1)^{M-1-k} (k+1)^{N-1} && \text{($k = j - 1$)} \\\[0.4cm]
&= M \sum\_{k=0}^{M-1} \binom{M-1}{k} (-1)^{M-1-k} \left\[ \sum\_{i=0}^{N-1} \binom{N-1}{i} k^i \right] \\\[0.4cm]
&= M \sum\_{i=0}^{N-1} \binom{N-1}{i} \left\[ \sum\_{k=0}^{M-1} \binom{M-1}{k} (-1)^{M-1-k} k^i \right] \\\[0.4cm]
&= M \sum\_{i=0}^{N-1} \binom{N-1}{i} (M-1)! \begin{Bmatrix} i \ M-1 \end{Bmatrix} && \text{(see above)}\\\[0.4cm]
&= M! \sum\_{i=0}^{N-1} \binom{N-1}{i} \begin{Bmatrix} i \ M-1 \end{Bmatrix} \\\[0.4cm]
&=  M! \begin{Bmatrix} N \ M \end{Bmatrix}.
\end{align\*}
$$

## Exercise 9.13

The EGF immediately follows from Table 3.6, but we can easily derive it using the symbolic method. A partition of elements into indistinguishable nonempty groups (subsets) can be structurally defined as a set of nonempty sets of elements. This can be expressed as

$$
\mathcal{F}=\text{SET}(\text{SET}\_{>0}(\mathcal{Z})) \implies F(z)=e^{e^z-1}.
$$

## Exercise 9.14

Wikipedia has the full [proof](https://en.wikipedia.org/wiki/Dobi%C5%84ski%27s_formula#Proof).

## Exercise 9.15

This directly follows from marking the inner sets by $$u$$ (see [Exercise 9.13](#exercise-9.13)). Therefore,

$$
\mathcal{F}=\text{SET}(u \times \text{SET}\_{>0}(\mathcal{Z})) \implies F(z,u)=e^{u(e^z-1)}.
$$

The coefficient $$u^k$$ represents partitions into exactly $$k$$ subsets. The size of those individual subsets is tracked by the $$z$$ variable.

## Exercise 9.16

{% hint style="danger" %}
The book contains a typo. The correct recurrence is $$F\_{NM} = \sum\_{k=0}^{N} \binom{N}{k} F\_{(N-k)(M-1)}$$.
{% endhint %}

Here, the functional equation is simpler than in [Exercise 9.10](#exercise-9.10) (with the same initial condition). This gives

$$
F\_M(z) = e^z F\_{M-1}(z).
$$

Let $$u$$ mark the alphabet size $$M$$, and let's construct $$F(z, u) = \sum\_{M\ge0} F\_M(z) u^M$$. We can establish the following relationship from the functional equation

$$
\sum\_{M\ge1} F\_M(z) u^M = \sum\_{M\ge1} e^z F\_{M-1}(z) u^M.
$$

This is equivalent to

$$
F(z, u) - F\_0(z) = u e^z \sum\_{M\ge1} F\_{M-1}(z) u^{M-1}.
$$

Now, we can derive the requested BGF

$$
F(z, u) - 1 = u e^z F(z, u) \implies \boxed{F(z, u) = \frac{1}{1 - u e^z}}.
$$

## Exercise 9.17

The EGF for $$M$$-words with all letter frequencies even is (see Table 3.3 in the book)

$$
(1 + z^2/2! + z^4/4!+... + z^{2k}/(2k)! + ...)^M = \frac{(e^z+e^{-z})^M}{2^M}.
$$

## Exercise 9.18

The EGF for an urn requiring $$>k$$ balls is $$e^z - \sum\_{i=0}^k \frac{z^i}{i!}$$. The lowest order term in this series is $$\frac{z^{k+1}}{(k+1)!}$$.\
When we raise this EGF to the power of $$M$$, the very first non-zero term is

$$
F\_{M,>k}(z)= \left( \frac{z^{k+1}}{(k+1)!} \right)^M + \dots = \frac{z^{M(k+1)}}{\left((k+1)!\right)^M} + \dots
$$

This gives

$$
(M(k+1))! \[z^{M(k+1)}] F\_{M,>k}(z) =\boxed{ \frac{(M(k+1))!}{\left((k+1)!\right)^M}}.
$$

Using Theorem 9.3, the EGF for an urn with $$<(k+2)$$ balls is $$\sum\_{i=0}^{k+1} \frac{z^i}{i!}$$. We want to extract the coefficient for $$M(k+1) - 1$$ from this EGF raised to the power of $$M$$. To build $$z^{M(k+1)-1}$$ from the product of $$M$$ polynomials (where the max term is $$z^{k+1}$$), we are forced to pick the $$z^{k+1}/(k+1)!$$ term exactly $$M-1$$ times and the $$z^k/k!$$ term exactly once. There are $$M$$ ways to choose which of the $$M$$ polynomials contributes the $$z^k$$ term. Therefore, the extracted coefficient multiplied by $$(M(k+1) - 1)!$$ and $$M$$ is

$$
(M(k+1)-1)! \left( M \times \frac{1}{k! \left((k+1)!\right)^{M-1}} \right)=\frac{M(k+1) \times (M(k+1) - 1)!}{(k+1) \times k! \left((k+1)!\right)^{M-1}}= \boxed{\frac{(M(k+1))!}{\left((k+1)!\right)^M}}.
$$

## Exercise 9.19 🌟

{% hint style="success" %}
This exercise builds upon the analysis of the birthday paradox and showcases an interesting fact about collisions in hash tables.
{% endhint %}

{% hint style="warning" %}
The text of this exercise asks us to derive a formula for $$N$$ urns, but we’ll denote the number of urns with $$M$$ to retain consistency.
{% endhint %}

Let $$X$$ be the number of balls thrown until the *second* collision occurs. The expected value can be computed using the tail sum formula

$$
\mathbb{E}\[X]=\sum\_k \Pr{X>k}.
$$

{% hint style="info" %}
Theorem 9.1 speaks in terms of UNTIL condition, whilst this exercise mentions BEFORE. Asymptotically there is no difference, but in discrete terms we have $$\mathbb{E}\[\text{Before}] = \mathbb{E}\[\text{Until}] - 1$$.
{% endhint %}

The probability that there are no collisions when $$N$$ balls are thrown into $$M$$ urns is given by Theorem 9.1. Therefore,

$$
\Pr\_0{X>N} = \binom{M}{N}\frac{N!}{M^N}.
$$

If exactly 1 collision occurred in $$N$$ throws, it means exactly $$N-1$$ distinct urns are occupied. One urn contains exactly 2 balls, and the other $$N-2$$ urns contain exactly 1 ball. Thus,

$$
\Pr\_1{X>N}=\frac{\dbinom{M}{N-1}(N-1)!\dbinom{N}{N-2}}{M^N}.
$$

This gives

$$
\mathbb{E}\[X]=\sum\_k( \Pr\_0{X>k}  + \Pr\_1{X>k}).
$$

The first sum is (see Theorem 9.1) $$\sim \sqrt{\frac{\pi M}{2}}$$. To solve the second sum, let's rewrite it

$$
\Pr\_1{X>k}=\frac{k(k-1)}{2M} \cdot \frac{M!}{(M-(k-1))! M^{k-1}} = \frac{k(k-1)}{2M} \Pr\_0{X>k-1}.
$$

Now, we proceed similarly as in [Exercise 9.6](#exercise-9.6)

$$
\sum\_k \Pr\_1{X>k} \sim \int\_{0}^{\infty} \frac{x^2}{2M} e^{\frac{-x^2}{2M}} dx=\frac{1}{2} \sqrt{\frac{\pi M}{2}}.
$$

{% hint style="info" %}
The integral is listed as of similar [form](https://en.wikipedia.org/wiki/Gaussian_integral#Integrals_of_similar_form) of the standard Gaussian integral on Wikipedia.
{% endhint %}

The total expected time until the second collision is the sum of both parts  $$\sim  \frac{3}{2} \sqrt{\frac{\pi M}{2}}$$. It tells us that asymptotically, getting the second collision takes roughly 50% longer than getting the first one.

## Exercise 9.20

{% hint style="warning" %}
The text of this exercise asks us to derive a formula for $$N$$ urns, but we’ll denote the number of urns with $$M$$ to retain consistency.
{% endhint %}

We can repeat more or less the same math as in the previous exercise to conclude that the leading asymptotic term remains intact. Instead, let’s argue in a different way. At the time the first collision happens, we have thrown $$\sim \sqrt{M}$$ balls. At this exact moment, the board looks like this:

* Exactly 1 urn has 2 balls.
* Roughly $$\sqrt{M}$$ urns have exactly 1 ball.

The probability that the next thrown ball hits the single urn that already has 2 balls is $$1/M$$. The probability it hits one of the urns with 1 ball (triggering our new definition of a collision) is roughly $$\frac{\sqrt{M}}{M} = \frac{1}{\sqrt{M}}$$. Because $$\frac{1}{\sqrt{M}}$$ is massively larger than $$\frac{1}{M}$$ as $$M \to \infty$$, it’s overwhelmingly statistically likely that the "next" collision will hit a 1-ball urn rather than the 2-ball urn. The events we’re trying to ignore (balls falling into urns with more than 2 balls) are so extremely rare, at this stage of the process, that they don’t affect the leading asymptotic term!

## Exercise 9.21 🌟

{% hint style="success" %}
This exercise demonstrates the multinomial theorem in action.
{% endhint %}

The book already defines for us the EGF for words comprising at most $$k$$ occurrences of each of $$M$$ different letters. Here, we have $$k=2$$

$$
F\_M(z)= \left( 1 + z + \frac{z^2}{2} \right)^M.
$$

We need to find the explicit expression for the number of words, which is $$N!\[z^N]F\_M(z)$$. Using the multinomial expansion, the EGF is

$$
F\_M(z) = \sum\_{k\_0 + k\_1 + k\_2 = M} \frac{M!}{k\_0! k\_1! k\_2!} (1)^{k\_0} (z)^{k\_1} \left(\frac{z^2}{2}\right)^{k\_2}.
$$

Substitute $$k\_1 = N - 2k\_2$$, which also forces $$k\_0 = M - N + k\_2$$. This gives

$$
F\_M(z) = \sum\_{k} \frac{M!}{(M - N + k)! (N - 2k)! k!} \frac{z^N}{2^k} \\\[0.4cm]
\therefore N!\[z^N]F\_M(z) = N! \sum\_{k} \frac{M!}{(M - N + k)! (N - 2k)! k!} \frac{1}{2^k}=\sum\_{k=0}^{\lfloor N/2 \rfloor} \boxed{\binom{M}{N-k} \binom{N-k}{k} \frac{N!}{2^k}}.
$$

{% hint style="info" %}
Let $$k$$ be the number of letters that appear exactly twice. We can pick $$N-k$$ distinct letters in $$\binom{M}{N-k}$$ ways from an alphabet of size $$M$$. Among these, we can select $$\binom{N-k}{k}$$ to be duplicates. Finally, there are $$N!/2^k$$ ways to reorder them. Summing over all $$k$$ gives our explicit formula.
{% endhint %}

## Exercise 9.22

This is a variation of the previous exercise with $$M=365$$. The probability of not having a triplet is

$$
\frac{N!}{365^N}  \[z^N] \left( 1 + z + \frac{z^2}{2} \right)^{365}.
$$

We can computationally evaluate this exact coefficient extraction to build the curve shown below.

<figure><img src="/files/PuaCmASmuAjJdH9Mkgct" alt=""><figcaption></figcaption></figure>

## Exercise 9.23

The plot from the previous exercise already gives the answer for three people. The explicit formula for the probability of not having a four-person shared birthday is

$$
\frac{N!}{365^N} \[z^N] \left( 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} \right)^{365}.
$$

It turns out, that crossing the 50% threshold for four people with shared birthday happens at $$N = 187$$.

## Exercise 9.24 🌟

{% hint style="success" %}
This exercise introduces the [double dixie cup](https://doi.org/10.2307/2308930) problem.
{% endhint %}

Let $$X$$ be the number of balls thrown until every urn has at least 2 balls. The event $$X \le N$$ means that after exactly $$N$$ throws, all $$M$$ urns contain at least two balls. Theorem 9.4 from the book immediately gives

$$
\Pr{X \le N} = \frac{N!}{M^N} \[z^N] (e^z - 1 - z)^M.
$$

Therefore, the average number of balls thrown into $$M$$ urns before each urn is filled at least twice is

$$
\mathbb{E}\[X] = \sum\_{N\ge0} \left( 1 - \frac{N!}{M^N} \[z^N] (e^z - 1 - z)^M \right) = M \ln M + M \ln \ln M + O(M).
$$

{% hint style="info" %}
Intuitively, we might think that filling every urn with at least 2 balls takes x2 times as much as filling them with 1 ball. But this is obviously not the case. By the time we finally fill the last empty urn, all the others already contain more than one ball. Filling them a second time requires barely any extra effort.
{% endhint %}

## Exercise 9.25 🌟

{% hint style="success" %}
This model is the direct "dual" to the double dixie cup work (see the previous exercise). While the dixie cup problem looks at the expected time $$N$$ to reach a certain occupancy, this model looks at the expected occupancy for a fixed time.
{% endhint %}

Let $$X$$ be the random variable representing the minimal occupancy (the number of balls in the least-filled urn) when distributing $$N$$ balls into $$M$$ urns. Theorem 9.4 gives

$$
\mathbb{E}\[X] = \sum\_{k\ge1} P(X \ge k) = \frac{1}{M^N} \sum\_{k\ge1} N! \[z^N] \left( e^z - \sum\_{j=0}^{k-1} \frac{z^j}{j!} \right)^M.
$$

If we multiply the average value with the total number of combinatorial objects, then we get the CGF

$$
C\_M(z) = \sum\_{k=1}^{\infty} \left( e^z - \sum\_{j=0}^{k-1} \frac{z^j}{j!} \right)^M.
$$

The following Python code tabulates the values for M and N less than 20. The function `count_valid_distributions` uses dynamic programming and recurrence from [Exercise 9.16](#exercise-9.16) to maintain required precision and accuracy.

{% code expandable="true" %}

```
import math

def count_valid_distributions(m, n, k):
    """
    Calculates the exact number of ways to distribute N distinct balls
    into M distinct urns such that EVERY urn contains at least k balls.
    """
    # dp[m][n] = the number of valid distributions for m urns and n balls.
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    # Base case: 1 way to distribute 0 balls into 0 urns.
    dp[0][0] = 1

    for p in range(1, m + 1):
        for q in range(n + 1):
            ways = 0
            for r in range(k, q + 1):
                ways += math.comb(q, r) * dp[p - 1][q - r]
            dp[p][q] = ways
    return dp[m][n]

def expected_minimal_occupancy(m, n):
    """
    Calculates the expected minimal occupancy for N balls in M urns
    using the tail sum formula.
    """
    # If there are more urns than balls, the pigeonhole principle
    # guarantees at least one urn is empty (min occupancy = 0).
    if m == 0 or n < m:
        return 0.0

    total = m ** n
    s = 0
    # The maximum possible minimal occupancy is N // M
    for k in range(1, n // m + 1):
        s += count_valid_distributions(m, n, k)
    return s / total

n_vals = range(2, 20)
m_vals = range(2, 20)

header_row = f"| {'N \\ M':<5} | " + " | ".join([f"{m:>6}" for m in m_vals]) + " |"
separator = "|-------|" + "|".join(["--------" for _ in m_vals]) + "|"
print(header_row)
print(separator)
for n in n_vals:
    row_strs = [f"| {n:<2}   "]
    for M in m_vals:
        val = expected_minimal_occupancy(M, n)
        if val == 0:
            row_strs.append(f"{0:>6}")
        else:
            row_strs.append(f"{val:>6.4f}")
    print(" | ".join(row_strs) + " |")
```

{% endcode %}

## Exercise 9.26 🌟

{% hint style="success" %}
This exercise highlights the utility of indicator random variables in solving combinatorial problems (see also the next exercise). They can also give an idea how to formally describe the corresponding combinatorial class using the symbolic method.
{% endhint %}

This problem can be easily solved using indicator random variables $$I\_i$$ for each position in the word

$$
I\_i= \begin{cases}
1 &\text{if a block starts at position }  i, \\
0 &\text{if the block continues the previous letter}.
\end{cases}
$$

We know that $$\mathbb{E}\[I\_1]=\Pr{I\_1=1}=1$$, since the first block is always new. For the other blocks

$$
\mathbb{E}\[I\_i] = \Pr{I\_i = 1} = \frac{M-1}{M},
$$

as $$M-1$$ different characters may be at the previous position. If the current letter is different, then it starts a new block. The total number of blocks in the entire word is $$X=\sum\_i I\_i$$. The average number of blocks of contiguous equal elements in a random word is

$$
\mathbb{E}\[X] = \mathbb{E}\[I\_1] + \sum\_{i=2}^{N} \mathbb{E}\[I\_i] = 1 + (N-1) \left( \frac{M-1}{M} \right).
$$

{% hint style="info" %}
As a quick sanity check, the predicted average for Table 9.2 in the book is 3, which perfectly matches the data.
{% endhint %}

#### Modeling via BGF

For the sake of completeness, let’s develop the BGF for this problem using the symbolic method. The average can be computed by using Table 3.5 from the book (we already know the answer).

$$
\mathcal{B}=\text{SEQ}(\mathcal{Z})-\epsilon \\\[0.2cm]
\mathcal{F}=\epsilon+\underbrace{uM\mathcal{B}}*{\text{first block}} \times \text{SEQ}(\underbrace{\mathcal{u(M-1)B}}*{\text{other block}}).
$$

Observe, that we mark blocks (nonempty contiguous equal elements). We’ve $$M$$ choices for the first block and $$M-1$$ for all subsequent ones. The BGF immediately follows from the corresponding transfer theorem

$$
F(z, u) = 1 + \frac{ M \dfrac{uz}{1-z} }{ 1 - (M-1) \dfrac{uz}{1-z} }= \boxed{1 + \frac{M u z}{1 - z - (M-1) u z}}.
$$

## Exercise 9.27

Let's look at any adjacent pair of letters $$w\_i$$ and $$w\_{i+1}$$ from an $$M$$-letter alphabet. Since the word is completely random, there are $$M^2$$ possible pairs for these two positions. We can partition them into three categories:

1. $$w\_i < w\_{i+1}$$ (rise)
2. $$w\_i > w\_{i+1}$$ (fall)
3. $$w\_i = w\_{i+1}$$ (flat continuation)

Because of symmetry, the number of rising pairs exactly equals the number of falling pairs. Specifically, out of the $$M^2$$ pairs, exactly $$M$$ of them are flat (the two letters are the same). The remaining $$M^2 - M$$ pairs are evenly split between rises and falls. Therefore, the number of  rising/falling pairs is $$\frac{M^2 - M}{2} = \binom{M}{2}$$.

The probability of a rise at any specific position $$i$$ is

$$
\Pr{\text{Rise}\_i} = \frac{\frac{M^2 - M}{2}}{M^2} = \frac{M-1}{2M}.
$$

There are $$N-1$$ adjacent pairs in a word of length $$N$$. Let $$I\_i$$ be an indicator random variable that equals 1 if a rise occurs at position $$i$$. By the linearity of expectation and symmetry

$$
\mathbb{E}\[\text{Falls}]=\mathbb{E}\[\text{Rises}] = \sum\_{i=1}^{N-1}\mathbb{E}\[I\_i] = \sum\_{i=1}^{N-1} \Pr{\text{Rise}\_i}= (N-1) \left( \frac{M-1}{2M} \right).
$$

A run is a maximal contiguous subsequence of elements. We’ve to be careful about whether a run is strictly increasing or non-decreasing. Let’s use the latter. The first letter always starts a fresh run. A new run starts at position $$i+1$$ if and only if $$w\_i > w\_{i+1}$$ (a fall). The expected number of non-decreasing runs is

$$
\mathbb{E}\[\text{Runs}] = \mathbb{E}\[\text{Falls}]+1=1 + (N-1) \left( \frac{M-1}{2M} \right).
$$

{% hint style="info" %}
This definition is aligned with the similar property of runs and falls for permutations, as explained in the book.
{% endhint %}

## Exercise 9.28

The probability is $$100 \times100^{-100}=100^{-99}$$. We can sum the individual probabilities, as the events for different urns are mutually exclusive (only one urn can possibly hold all balls).

## Exercise 9.29

The probability is $$100! \times 100^{-100}$$, since we have 100! possible mappings of balls to urns.

## Exercise 9.30

Since we are tracking empty urns, we set $$k=0$$. Substituting this into the EBGF from the book (see the subsection about alternative derivations), we get

$$
F(z,u) =  \left( e^z + u - 1 \right)^M.
$$

Let $$U$$ be the number of empty urns. We get

$$
\mathbb{E}\[U] = M(1 - 1/M)^N.
$$

The second derivative of $$F$$ evaluated at $$u=1$$ is

$$
\left. \frac{\partial^2 F}{\partial u^2} \right|\_{u=1} = M(M-1)e^{(M-2)z}.
$$

The total cumulated cost of pairs of empty urns is

$$
N! \[z^N] M(M-1)e^{(M-2)z} = M(M-1)(M-2)^N.
$$

Using Table 3.5 from the book, we are ready to plug in all parts into the variance formula

$$
\text{Var}(U) = M(M-1)\left(1 - \frac{2}{M}\right)^N + M\left(1 - \frac{1}{M}\right)^N - M^2\left(1 - \frac{1}{M}\right)^{2N}.
$$

Using $$\ln(1-x) \sim -x - x^2/2$$, let's expand the terms for $$N/M=\alpha >0$$ fixed as $$N,M \to \infty$$.

$$
\left(1 - \frac{2}{M}\right)^{\alpha M} \sim \exp\left( \alpha M \left( -\frac{2}{M} - \frac{2}{M^2} \right) \right) = \exp\left( -2\alpha - \frac{2\alpha}{M} \right) \sim e^{-2\alpha}\left(1 - \frac{2\alpha}{M}\right).
$$

Therefore,

$$
M(M-1) e^{-2\alpha} \left(1 - \frac{2\alpha}{M}\right) \approx M^2 e^{-2\alpha} - 2\alpha M e^{-2\alpha} - M e^{-2\alpha}.
$$

The quadratic expectation term is

$$
\left(1 - \frac{1}{M}\right)^{2\alpha M} \sim \exp\left( 2\alpha M \left( -\frac{1}{M} - \frac{1}{2M^2} \right) \right) = \exp\left( -2\alpha - \frac{\alpha}{M} \right) \sim e^{-2\alpha}\left(1 - \frac{\alpha}{M}\right).
$$

The book already provides $$\mathbb{E}\[U]$$, thus,

$$
(\mathbb{E}\[U])^2 \sim M^2 e^{-2\alpha} - \alpha M e^{-2\alpha}.
$$

Put all three expanded pieces back into the variance formula

$$
\text{Var}(U) \sim \left\[ M^2 e^{-2\alpha} - 2\alpha M e^{-2\alpha} - M e^{-2\alpha} \right] + \left\[ M e^{-\alpha} \right] - \left\[ M^2 e^{-2\alpha} - \alpha M e^{-2\alpha} \right]= M e^{-\alpha} \left( 1 - (1+\alpha)e^{-\alpha} \right).
$$

Thus, the standard deviation is

$$
\sigma\_U \sim \sqrt{M} \cdot \sqrt{e^{-\alpha}\bigl(1 - (1+\alpha)e^{-\alpha}\bigr)}.
$$

## Exercise 9.31 🌟

{% hint style="success" %}
This exercise counterbalances the approach with indicator random variables; it showcases why EGFs are so incredibly powerful. Trying to solve this with discrete indicator variables or inclusion-exclusion principle would be a combinatorial nightmare.
{% endhint %}

The EGF for a single urn is (see Table 3.3 in the book)

$$
U\_{\text{even}}(z) = \sum\_{k\ge0} \frac{z^{2k}}{(2k)!} = \frac{e^z + e^{-z}}{2}.
$$

The EGF for $$M$$ urns is simply an $$M$$-sequence of the above EGF

$$
F(z) = \left( \frac{e^z + e^{-z}}{2} \right)^M= \frac{1}{2^M} \sum\_{j=0}^{M} \binom{M}{j} e^{(2j-M)z}.
$$

The total number of valid distributions is

$$
N!\[z^N]F(z) = \frac{1}{2^M} \sum\_{j=0}^{M} \binom{M}{j} (2j-M)^N.
$$

Therefore, the probability that each urn will contain an even number of balls when $$N$$ balls are distributed among $$M$$ urns is

$$
\Pr{\text{All Even}} = \frac{1}{2^M M^N} \sum\_{j=0}^{M} \binom{M}{j} (2j-M)^N.
$$

#### Quick Sanity Check of our Formula

If $$N$$ is odd, then it’s impossible to divide an odd number of balls into a sum of strictly even numbers. So, the probability must be exactly 0. Interestingly, this is handled for us automatically by the EGF machinery. The binomial coefficient is symmetric, so $$\binom{M}{j} = \binom{M}{M-j}$$. Let’s see what do we get by summing such symmetric entries (for $$N$$ odd, $$(-x)^N=-x^N$$)

$$
(2j-M)^N \leftrightarrow (2(M-j)-M)^N = (M-2j)^N = -(2j-M)^N.
$$

They annihilate each other, hence the total sum is zero. When $$M$$ is even, we also have one singular term for $$j=M/2$$, but this trivially evaluates to zero.

## Exercise 9.32

$$C^{\[M]}*{Nk}$$ represents the total number of $$M$$-words of length $$N$$ such that a specific letter occurs exactly $$k$$ times. Assume $$N\ge k$$. The base case is $$C^{\[M]}*{00}=1$$. The recurrence relationship follows from the following observations:

* We can select $$M-1$$ different letters for expanding the $$M$$-words of length $$N-1$$ already containing $$k$$ occurrences of a specific letter.
* We can append a letter to $$M$$-words of length $$N-1$$ containing $$k-1$$ occurrences of that letter.

The Python script below computes these values.

{% code expandable="true" %}

```python
def print_occupancy_distribution(n, m):
    c = [[0] * (n + 1) for _ in range(n + 1)]
    c[0][0] = 1
    for n in range(1, n + 1):
        for k in range(n + 1):
            c[n][k] = (m - 1) * c[n - 1][k] + (c[n - 1][k - 1] if k > 0 else 0)

    print(f"Occupancy distribution for N = {n} balls and M = {m} urns")
    print("-" * 34)
    print(f"{'k':<7} | {'C[M](N,k)':<10} | {'Probability':<20}")
    print("-" * 34)

    total_words = m ** n
    for k in range(n + 1):
        p = c[n][k] / total_words
        print(f"{k:<7} | {c[n][k]:<10} | {p:.6f}")

# Reproduce the last row from Table 9.7.
print_occupancy_distribution(n=6, m=3)
```

{% endcode %}

## Exercise 9.33

We need to use Theorems 6.6 (see also [Exercise 6.37](https://evarga.gitbook.io/sh-aofa/algorithms-and-combinatorial-structures/pages/f93B9hYr4j4XMcDmq8le#exercise-6.37)) and 9.6 from the book.

{% hint style="warning" %}
As the official errata points out, the formulae in Theorem 9.6 are different than those given in Table 9.9. Therefore, the results below eesentially illustrate the methodology, which is the most important part of the whole story.
{% endhint %}

In case of an unsuccessful search, we have:

* For $$N=1000$$ we want $$1000/M \le 2H\_{1001}-2 \implies M \ge 78$$.
* For $$N=10^6$$ we want $$10^6/M \le 2H\_{10^6+1}-2\implies M \ge 37334$$.

In case of a successful search, we have:

* For $$N=1000$$ we want $$999/(2M) \le 2H\_{1000}-3+ 2H\_{1000}/1000 \implies M \ge 42$$.
* For $$N=10^6$$ we want $$(10^6-1)/(2M) \le 2H\_{10^6}-3+ 2H\_{10^6}/10^6 \implies M \ge 19391$$.

{% hint style="info" %}
In essence, the analysis highlights the general relationship: to beat a binary search tree of size $$N$$, we need a hash table of size $$M=\Omega(\frac{N}{\ln N})$$.
{% endhint %}

## Exercise 9.34 🌟

{% hint style="success" %}
This exercise introduces a powerful concept in probability theory called the [law of total variance](https://en.wikipedia.org/wiki/Law_of_total_variance).
{% endhint %}

Let $$S$$ be the random variable representing the number of comparisons (probes) for a successful search. Imagine we pick one of the $$N$$ keys in the table uniformly at random. We want to find the search cost for this specific key. Let $$X$$ be the number of other keys that hashed into the exact same list as our target key. Since each of the remaining $$N-1$$ keys had an independent $$p=1/M$$ chance of landing in this list, $$X$$ perfectly follows a binomial distribution. From this, we know the mean and variance of $$X$$:

* $$\mathbb{E}\[X] = (N-1)p$$
* $$\text{Var}(X) = (N-1)p(1-p)$$

If there are $$X$$ other keys in the list, the total length of the chain is $$X+1$$. Because our target key is equally likely to have been inserted at any position in this chain, the search cost $$S$$ given $$X$$ is uniformly distributed on the integers $${1, 2, \dots, X+1}$$. For a [discrete uniform distribution](https://en.wikipedia.org/wiki/Discrete_uniform_distribution) on $${1, \dots, n}$$, the mean is $$\frac{n+1}{2}$$ and the variance is $$\frac{n^2-1}{12}$$. Substituting $$n = X+1$$, we get our conditional moments:

* $$\mathbb{E}\[S \mid X] = \frac{X+2}{2} = 1 + \frac{X}{2}$$
* $$\text{Var}(S \mid X) = \frac{(X+1)^2 - 1}{12} = \frac{X^2 + 2X}{12}$$

According to the law of total variance, we have

$$
\text{Var}(S) = \mathbb{E}\[\text{Var}(S \mid X)] + \text{Var}(\mathbb{E}\[S \mid X]).
$$

Let’s compute the pieces

$$
\begin{align\*}
\text{Var}( \mathbb{E}\[S \mid X]) &= \text{Var}\left(1 + \frac{X}{2}\right) = \frac{1}{4}\text{Var}(X) = \frac{(N-1)p(1-p)}{4} && \text{($ \text{Var}(\alpha X)=\alpha^2\text{Var}(X)$)}, \\\[0.5cm]
\hline \\
\mathbb{E}\[\text{Var}(S \mid X)] &= \mathbb{E}\left\[ \frac{X^2 + 2X}{12} \right] \\
&= \frac{1}{12} \left( \mathbb{E}\[X^2] + 2\mathbb{E}\[X] \right) \\
&= \frac{1}{12} \Big( (N-1)p(1-p) + (N-1)^2p^2 + 2(N-1)p \Big) && \text{($\mathbb{E}\[X^2] = \text{Var}(X) + (\mathbb{E}\[X])^2$)}.
\end{align\*}
$$

Adding them together and simplifying, we get

$$
\text{Var}(S) = \frac{1}{2}(N-1)p - \frac{1}{3}(N-1)p^2 + \frac{1}{12}(N-1)^2 p^2= \frac{N-1}{2M} + \frac{(N-1)(N-4)}{12M^2}.
$$

This gives the standard deviation of the number of comparisons required for a successful search in hashing with separate chaining

$$
\sigma\_S = \sqrt{ \frac{N-1}{2M} - \frac{N-1}{3M^2} + \frac{(N-1)^2}{12M^2} } \approx  \sqrt{ \frac{\alpha}{2} + \frac{\alpha^2}{12} } \qquad \text{(with a load factor $\alpha = N/M$)}.
$$

## Exercise 9.35 🌟

{% hint style="success" %}
This exercise is a case study how to analyze optimization techniques, like in this case, keeping the chains sorted. Without developing a mathematical model, we may never be sure do we have a good or bad idea.
{% endhint %}

Sorting the lists does absolutely nothing to improve the average cost of a successful search! If a key is actually in the table, we still have to traverse the list until we find it. Because its rank among the keys in that list (chain) is perfectly uniform, we’ll search exactly as far on average as we would in an unsorted list. Therefore, keeping the chains sorted is a mechanism specifically designed to optimize unsuccessful searches.

Let $$X$$ be the number of keys that hashed into the exact same list as our search key. Let $$P$$ be the number of comparisons (probes) we make. The conditional average is

$$
\mathbb{E}\[P \mid X] = \frac{1}{X+1} \left( \left(\sum\_{i=1}^X i\right) + \bold{X} \right) = \frac{X}{2} + 1 - \frac{1}{X+1}.
$$

The bolded $$X$$ in the above formula represents the case of hitting the end of the list.

To find the unconditional average $$\mathbb{E}\[P]$$, we use the fact that the number of keys in any list perfectly follows a Poisson distribution in the asymptotic limit $$N, M \to \infty$$ with $$\alpha=Np=N/M$$:

$$
\lim\_{N, M \to \infty} \text{Binomial}(N, p=1/M) = \text{Poisson}(\alpha).
$$

{% hint style="info" %}
Shifting to a Poisson distribution drastically simplifies the calculus without impacting the asymptotic conclusions. For example,\
$$E\left\[\frac{1}{X+1}\right] = \sum\_{k} \frac{1}{k+1} \frac{\alpha^k e^{-\alpha}}{k!} = \sum\_{k} \frac{\alpha^k e^{-\alpha}}{(k+1)!}= \frac{e^{-\alpha}}{\alpha} \sum\_{k} \frac{\alpha^{k+1}}{(k+1)!}= \frac{e^{-\alpha}}{\alpha} (e^\alpha - 1) = \mathbf{\frac{1 - e^{-\alpha}}{\alpha}}$$.\
Computing this expectation whilst sticking to the exact binomial distribution would be a nightmare.
{% endhint %}

According to the [law of total expectation](https://en.wikipedia.org/wiki/Law_of_total_expectation), we have

$$
\mathbb{E}\[P] =\mathbb{E}\[\mathbb{E}\[P \mid X]]= \frac{\mathbb{E}\[X]}{2} + 1 - \mathbb{E}\left\[\frac{1}{X+1}\right]= \frac{\alpha}{2} + 1 - \frac{1 - e^{-\alpha}}{\alpha} \approx \frac{\alpha}{2}.
$$

{% hint style="info" %}
Obviously, sorting cuts, on the average, the search time by half. Of course, this must be balanced with the cost of maintaining the chain sorted. In practice, this almost never worths the effort.
{% endhint %}

We also need

$$
\mathbb{E}\[P^2 \mid X] = \frac{1}{X+1} \left( \left(\sum\_{i=1}^X i^2\right) + X^2 \right) = \frac{1}{X+1} \left( \frac{X(X+1)(2X+1)}{6} + X^2 \right) = \frac{X^2}{3} + \frac{7X}{6} - 1 + \frac{1}{X+1}.
$$

This gives (we know that $$\mathbb{E}\[X]=\text{Var}(X)=\alpha$$)

$$
\mathbb{E}\[P^2] =\mathbb{E}\[\mathbb{E}\[P^2 \mid X]]= \frac{\alpha^2 + \alpha}{3} + \frac{7\alpha}{6} - 1 + \frac{1 - e^{-\alpha}}{\alpha} = \frac{\alpha^2}{3} + \frac{3\alpha}{2} - 1 + \frac{1 - e^{-\alpha}}{\alpha}.
$$

Now, we can compute the standard deviation as $$\sqrt {\text{Var}(P) }= \sqrt{\mathbb{E}\[P^2] - (\mathbb{E}\[P])^2}$$

$$
\sigma\_P = \sqrt{\frac{\alpha^2}{12} + \frac{\alpha}{2} - 2 + (\alpha + 3)\left(\frac{1 - e^{-\alpha}}{\alpha}\right) - \left(\frac{1 - e^{-\alpha}}{\alpha}\right)^2} \approx \sqrt{\frac{\alpha^2}{12} + \frac{\alpha}{2}}.
$$

## Exercise 9.36

In the section about the expected maximum occupancy of urns, the book claims that the length of the longest list when Program 9.1 is used will be $$\sim \ln N/\ln \ln N$$, on the average. If we compute two hash functions and put the key on the shorter of the two lists, then the longest list drops to $$\mathcal{O}(\ln \ln N)$$, on the average, with high probability. The details are given in the paper [Balanced Allocations](https://homes.cs.washington.edu/~karlin/papers/AzarBKU99.pdf).

## Exercise 9.37

The Cayley function is defined as $$C(z) = z e^{C(z)}$$ (see Theorem 6.14 in the book). We’ve to extract a coefficient from a function composed with an implicitly defined generating function. Therefore, we need to use the 3rd case of the Lagrange inversion theorem with $$g(u)=e^{\alpha u}$$ and $$f(u)=u/e^u$$. This gives

$$
\begin{align\*}
\[z^n] e^{\alpha C(z)} &= \frac{1}{n} \[u^{n-1}]  \alpha e^{\alpha u} e^{nu}  \\\[0.3cm]
&= \frac{\alpha}{n} \[u^{n-1}] e^{(\alpha + n)u} \\\[0.3cm]
&= \frac{\alpha (\alpha + n)^{n-1}}{n!}.
\end{align\*}
$$

## Exercise 9.38 🌟

{% hint style="success" %}
This exercise introduces Abel’s binomial theorem and shows how GFs help in proving seemingly daunting identities.
{% endhint %}

The LHS follows from the previous exercise

$$
\[z^n]e^{(\alpha + \beta)C(z)} = \frac{(\alpha + \beta)(n + \alpha + \beta)^{n-1}}{n!}.
$$

The RHS is a classical convolution of GFs, where we also reuse the result from the previous exercise. We’ve

$$
\[z^n]\left(e^{\alpha C(z)} e^{\beta C(z)}\right) = \sum\_{k=0}^n \left( \[z^k]e^{\alpha C(z)} \right) \left( \[z^{n-k}]e^{\beta C(z)} \right)= \sum\_{k=0}^n \frac{\alpha (k + \alpha)^{k-1}}{k!} \frac{\beta (n - k + \beta)^{n-k-1}}{(n-k)!}.
$$

Now, we just need to equate and simplify the LHS and RHS to get

$$
\begin{align\*}
\frac{(\alpha + \beta)(n + \alpha + \beta)^{n-1}}{n!} &= \sum\_{k=0}^n \frac{\alpha (k + \alpha)^{k-1}}{k!} \frac{\beta (n - k + \beta)^{n-k-1}}{(n-k)!} \\
(\alpha + \beta)(n + \alpha + \beta)^{n-1} &= \sum\_{k=0}^n \frac{n!}{k!(n-k)!} \alpha (k + \alpha)^{k-1} \beta (n - k + \beta)^{n-k-1} \\
&= \alpha \beta \sum\_{k=0}^n \binom{n}{k} (k + \alpha)^{k-1} (n - k + \beta)^{n-k-1}.
\end{align\*}
$$

## Exercise 9.39

Since "average search cost" can refer to either finding a key that is in the table (successful) or verifying a key is not in the table (unsuccessful), we cover both scenarios based on Table 9.9 from the book. The objective is to find the point where the cost equals $$\ln N$$.

The threshold for a successful search is

$$
\frac{1}{2}\left(1 + \frac{1}{1-\alpha}\right) = \ln N \implies \alpha = 1 - \frac{1}{2 \ln N - 1}.
$$

Substituting $$\alpha ≡ N/M$$ and solving for $$N$$, we get the implicit equation

$$
N = M \left( 1 - \frac{1}{2 \ln N - 1} \right).
$$

The threshold for an unsuccessful search is handled in virtually the same manner. Thus,

$$
N = M \left( 1 - \frac{1}{\sqrt{2 \ln N - 1}} \right).
$$

## Exercise 9.40

Program 9.2 will enter an infinite loop, which means the cost is infinite.

{% hint style="info" %}
Evaluating the exact formula from Table 9.9 for this edge case gives back a finite value. The reason is that the precondition $$N\<M$$ is violated, hence the output is meaningless.
{% endhint %}

## Exercise 9.41

We want to construct the EGF with respect to $$N$$, treating the table size $$M$$ as a fixed parameter. The method is virtually the same as in the case of a successful search demonstrated in the book. This gives

$$
\begin{align\*}
U\_M(z) &= \sum\_{N \ge 0} U\_{M,N} \frac{z^N}{N!} \\\[0.4cm]
&= \frac{1}{2} \sum\_{N \ge 0} \frac{z^N}{N!} + \frac{1}{2} \sum\_{N \ge 0} \sum\_{k=0}^N \frac{N!}{k!(N-k)!} \frac{(k+1)!}{M^k} \frac{z^N}{N!} && \text{(see Table 9.9 and reported errata)} \\\[0.4cm]
&= \frac{1}{2} e^z + \frac{1}{2} \sum\_{k \ge 0} \frac{k+1}{M^k} \sum\_{N=k}^\infty \frac{z^N}{(N-k)!} \\\[0.4cm]
&= \frac{1}{2} e^z + \frac{1}{2} \sum\_{k \ge 0} \frac{k+1}{M^k} z^k \sum\_{j \ge 0} \frac{z^j}{j!} \\\[0.4cm]
&= \frac{1}{2} e^z + \frac{1}{2} e^z \sum\_{k \ge 0} (k+1) \left(\frac{z}{M}\right)^k \\\[0.4cm]
&= \frac{1}{2} e^z \left( 1 + \frac{1}{(1 - z/M)^2} \right) && \text{$\left(\sum\_{k \ge 0} (k+1)x^k = \frac{1}{(1-x)^2}\right)$}.
\end{align\*}
$$

## Exercise 9.42

The "solution" is given in the matching footnote in the book.&#x20;

When you build an EGF using the symbolic method, the structures must physically make sense. It must natively encode the hard stop at $$N=M$$ while simultaneously yielding a clean, infinite-series function. Here is the pertinent citation from the book regarding the explicit EGF for a successful search:

> This is not directly meaningful for linear probing because the quantities are defined only for N ≤ M but it would seem a fine candidate for a combinatorial interpretation.

We had encountered a similar situation with CFGs (see [Exercise 8.35](https://evarga.gitbook.io/sh-aofa/algorithms-and-combinatorial-structures/pages/s3nxjJNwCnberr3JL8D4#exercise-8.35)). Employing BGFs may solve the spatial constraints.

Nonetheless, there is another fundamental issue. An inability of the symbolic method's grammar to directly specify the historical averaging process. Deriving the successful search cost requires performing a discrete integration over the timeline of the table's construction; it was computed in the book by averaging all unsuccessful searches that happened while building the table. But this is evidently a step outside the symbolic framework to finish the math.

{% hint style="info" %}
Some attempts have already been made with the combinatorial structure called an increasing tree to track history. The problem with linear probing is that disjoint clusters can merge over time and this introduces global dependencies that are difficult to handle.
{% endhint %}

I think, that aggressively forcing some method beyond its capabilities could jeopardize the core principles that were in place originally. Even if new operators and extensions may encompass the above features, it could be at the expense of comprehensibility and succinctness of the current grammar. This is reminiscent to an old battle in computer science with every [domain specific language](https://en.wikipedia.org/wiki/Domain-specific_language): "Should it be more powerful?" Once a DSL becomes Turing complete, abuses of the grammar happen, and things start to fall apart. One notable example is the C++'s templating engine, that inadvertently became overly capable. Now, a simple syntax error in a template can generate hundreds of lines of incomprehensible compiler errors.

## Exercise 9.43

Let’s use the symbolic method applied to the occupancy model (see Theorem 9.5). The EBGF for a single integer (urn) is $$\epsilon + u\times \text{SET}\_{>0}(\mathcal{Z})=1 + u(e^z - 1)$$.  Because the mapping consists of $$N$$ independent urns (the elements of the image), the EBGF for the entire mapping is simply the single-urn EBGF raised to the power of $$N$$. This gives

$$
C\_N(z, u) = (1 + u(e^z - 1))^N.
$$

It’s easy to verify the correctness of this EBGF by extracting the coefficient of $$u^k \frac{z^N}{N!}$$. Expanding the binomial and applying the identity from Table 3.6 for Stirling numbers of the second kind gives

$$
C\_N(z, u) = \sum\_{k=0}^N \binom{N}{k} u^k (e^z - 1)^k= \sum\_{k=0}^N \binom{N}{k} u^k \left( k! \sum\_{j \ge k} \genfrac{{}{}}{0pt}{}{j}{k} \frac{z^j}{j!} \right) \\\[0.4cm]
\therefore \left\[u^k \frac{z^N}{N!}\right] C\_N(z, u) = k! \binom{N}{k} \genfrac{{}{}}{0pt}{}{N}{k}.
$$

Now, performing the previous transformations in an opposite direction, starting from the formula for $$C\_{Nk}$$, we can arrive at the EBGF $$C\_N(z,u)$$.

## Exercise 9.44

See the previous exercise, which actually combines two approaches inside a single solution.

## Exercise 9.45

$$
\begin{align\*}
C\_{Nk} &= k! \binom{N}{k} \genfrac{{}{}}{0pt}{}{N}{k} \\\[0.4cm]
&= k! \binom{N}{k} \left\[ k \genfrac{{}{}}{0pt}{}{N-1}{k} + \genfrac{{}{}}{0pt}{}{N-1}{k-1} \right] && \text{(by Exercise 3.72)} \\\[0.4cm]
&= k \cdot \frac{N}{N-k} \cdot \left\[ k! \binom{N-1}{k} \genfrac{{}{}}{0pt}{}{N-1}{k} \right] + \frac{N}{k} \cdot k \left\[ (k-1)! \binom{N-1}{k-1} \genfrac{{}{}}{0pt}{}{N-1}{k-1} \right] \\\[0.4cm]
&= \frac{N \cdot k}{N-k} C\_{(N-1)k} + N C\_{(N-1)(k-1)}.
\end{align\*}
$$

While the above equation is the exact direct recurrence, it introduces a division by $$N-k$$. When programming this to generate a table, floating-point division can lead to precision errors for large combinatorial integers. Let’s use the substitution $$D\_{Nk}=C\_{Nk}/ \binom{N}{k}.$$ We can leverage again the following two identities:

* $$\binom{N}{k} = \frac{N}{N-k} \binom{N-1}{k}$$
* $$\binom{N}{k} = \frac{N}{k} \binom{N-1}{k-1}$$

This gives $$D\_{Nk}=k D\_{(N-1)k} + k D\_{(N-1)(k-1)}$$. The following Python script produces the table of mappings for $$N<20$$.

```python
import math

def generate_mappings(max_N):
    d = [[0] * (max_N + 1) for _ in range(max_N + 1)]
    d[1][1] = 1
    for n in range(2, max_N + 1):
        for k in range(1, n + 1):
            d[n][k] = k * d[n - 1][k] + k * d[n - 1][k - 1]

    for n in range(1, max_N + 1):
        row = []
        for k in range(1, n + 1):
            c = math.comb(n, k) * d[n][k]
            row.append(str(c))
        print(f"N={n}: " + ", ".join(row))

generate_mappings(19)
```

## Exercise 9.46

The explicit expression is

$$
\binom{M}{k} k! \genfrac{{}{}}{0pt}{}{N}{k}=M^{\underline{k}} \genfrac{{}{}}{0pt}{}{N}{k}.
$$

The combinatorial reasoning is analogous to the number of mappings, as explained in the book.

## Exercise 9.47

A random $$N$$-mapping is any function $$f$$ with the integers 1 to $$N$$ as both domain and range. Run the following procedure:

1. Initialize an empty set $$S$$ and a current value $$u = u\_0$$ (seed).
2. While $$u \notin S$$:
   1. Insert $$u$$ into $$S$$
   2. Set $$u = f(u)$$

Because $$S$$ can hold a maximum of $$N$$ distinct integers, the `while` loop can run a maximum of $$N$$ times. Upon the $$(N+1)$$th iteration (at the absolute latest), $$u$$ must evaluate to a number already in $$S$$ by the Pigeonhole principle. The loop terminates, and the cycle is found.

## Exercise 9.48

The following Python script reports some properties about the specified random mappings for various input sizes.

{% code expandable="true" %}

```python
def analyze_mapping(n):
    f = lambda i: 1 + (i * i + 1) % n

    # 1. Image ratio (alignment with uniform distribution)
    image = {f(i) for i in range(1, n + 1)}
    image_ratio = len(image) / n

    # 2. Cycle detection
    visited_global = set()
    cycle_lengths = []
    for i in range(1, n + 1):
        if i not in visited_global:
            path_visited = {}
            u = i
            step = 0

            # Trace the path until we hit a previously seen node.
            while u not in visited_global and u not in path_visited:
                path_visited[u] = step
                u = f(u)
                step += 1

            # If the node was seen in THIS specific path, we found a new cycle.
            if u in path_visited:
                cycle_len = step - path_visited[u]
                cycle_lengths.append(cycle_len)

            # Mark the entire path as globally visited.
            visited_global.update(path_visited.keys())

    return {
        "N": n,
        "Shortest Cycle": min(cycle_lengths) if cycle_lengths else 0,
        "Longest Cycle": max(cycle_lengths) if cycle_lengths else 0,
        "Total Cycles": len(cycle_lengths),
        "Image Ratio": image_ratio
    }

test_values = [100, 101, 1000, 1009, 10000, 10007]
print(f"{'N':<7} | {'Image Ratio':<12} | {'Total Cycles':<13} | {'Shortest':<9} | {'Longest'}")
print("-" * 60)
for val in test_values:
    res = analyze_mapping(val)
    print(
        f"{res['N']:<7} | {res['Image Ratio']:<12.3f} | {res['Total Cycles']:<13} | {res['Shortest Cycle']:<9} | {res['Longest Cycle']}")
```

{% endcode %}

It produces the following output:

<pre data-expandable="true"><code>N       | Image Ratio  | Total Cycles  | Shortest  | Longest
------------------------------------------------------------
100     | 0.220        | 4             | 2         | 8
<strong>101     | 0.505        | 2             | 6         | 6
</strong>1000    | 0.159        | 6             | 2         | 40
<strong>1009    | 0.500        | 5             | 1         | 13
</strong>10000   | 0.104        | 8             | 2         | 200
<strong>10007   | 0.500        | 3             | 1         | 55
</strong></code></pre>

A truly uniform random mapping has an image ratio $$1 - 1/e \approx 0.632$$, on the average. This follows from the corollary of Theorem 9.5 with $$\alpha=1$$ pertaining to the expected number of empty urns. Thus, a random mapping will leave about 36.8% of the range completely empty.

## Exercise 9.49 🌟

{% hint style="success" %}
This exercise introduces the [principle of deferred decisions](https://people.cs.umass.edu/~mcgregor/690RAS20/lec-deferred.pdf), a powerful technique in analyzing randomized algorithms.
{% endhint %}

Imagine we’re constructing our random mapping $$f: {1, 2, \dots, N} \to {1, 2, \dots, N}$$ dynamically, rather than generating the whole thing up front. Starting from an arbitrary seed $$x\_0$$, we generate subsequent values on the fly, essentially simulating each time an independent roll of an $$N$$-sided die.

The isomorphism to the birthday problem is based on the following observations:

* In the mapping, the table size is $$N$$. In the birthday problem, it’s the $$N$$ days of the year.
* In the mapping, we’re taking a step to a new node $$x\_k$$. In the birthday problem, it’s a new person walking in.
* In the mapping, we hit a cycle the moment $$f$$ outputs a node $$x\_j$$ we’ve already visited. In the birthday problem, we stop the moment a person announces a birthday that someone else in the room already has.

Let $$T$$ be the number of distinct nodes visited (the rho length) before closing the cycle. Let $$B$$ be the number of people in the room before the first shared birthday is found. Their probability distributions are identical

$$
\Pr{T > k} \equiv \Pr{B > k}.
$$

{% hint style="danger" %}
The book contains, at the time of this writing, an unreported errata. Namely, the fomula in Theorem 9.9 should read as $$\Pr{\text{rho length} > k} = \left(\frac{N-1}{N}\right) \left(\frac{N-2}{N}\right) \dots \left(\frac{N-k}{N}\right)$$.
{% endhint %}

Using the tail sum formula for expectation, we can also conclude that

$$
\mathbb{E}\[T] = \mathbb{E}\[B] = \sum\_{k=0}^N Pr{B > k} \sim \sqrt{\frac{\pi N}{2}}.
$$

## Exercise 9.50

The minimal rho length is achieved for identity mapping, resulting in the rho length of $$N$$. The maximal length is attained for a random mapping being a permutation with one cycle. The length in this case is $$N^2$$.

The minimal tree length is achieved for a random mapping being a permutation; all tails are empty, so the length is zero. The maximal length is attained for a random mapping being a single linear chain tree that terminates in a fixed point. The length is $$N(N-1)/2$$.

## Exercise 9.51

The next Python script computes the relevant statistics. The theoretical asymptotic estimates are based on Theorem 9.10 and Table 9.12 from the book. The averages are reported as seen from a random point.

{% code expandable="true" %}

```python
import random
import math

def simulate_random_mappings(n, trials=1000, seed=10):
    # Set the seed to make the experiment reproducible.
    random.seed(seed)

    total_cycles = 0
    total_avg_rho = 0
    total_avg_tail = 0

    for trial in range(trials):
        # Generate a 0-indexed random N-mapping.
        # f[i] = j means node i maps to node j
        f = [random.randrange(n) for _ in range(n)]

        # Memoization arrays: -1 means unvisited
        rho = [-1] * n
        tail = [-1] * n
        mapping_cycles = 0

        for i in range(n):
            # Skip if already computed.
            if rho[i] != -1:
                continue

            path = []
            path_map = {}
            u = i

            # Traverse until we hit a known node or form a new cycle.
            while rho[u] == -1 and u not in path_map:
                path_map[u] = len(path)
                path.append(u)
                u = f[u]

            # Case 1: We hit a node currently in our path (new cycle found).
            if u in path_map:
                mapping_cycles += 1
                cycle_start_idx = path_map[u]
                cycle_len = len(path) - cycle_start_idx

                # Assign metrics to the nodes inside the cycle.
                for j in range(cycle_start_idx, len(path)):
                    node = path[j]
                    tail[node] = 0
                    rho[node] = cycle_len

                # Prepare metrics for the tail nodes leading into this cycle.
                next_tail = 1
                next_rho = cycle_len + 1
                path = path[:cycle_start_idx] # Keep only the tail nodes.
            # Case 2: We hit a previously processed node.
            else:
                next_tail = tail[u] + 1
                next_rho = rho[u] + 1

            # Backtrack through the tail, assigning lengths.
            for node in reversed(path):
                tail[node] = next_tail
                rho[node] = next_rho
                next_tail += 1
                next_rho += 1

        # Accumulate the averages for this specific mapping.
        total_cycles += mapping_cycles
        total_avg_rho += sum(rho) / n
        total_avg_tail += sum(tail) / n

    # Calculate global averages across all trials.
    sim_cycles = total_cycles / trials
    sim_rho = total_avg_rho / trials
    sim_tail = total_avg_tail / trials

    # Calculate theoretical asymptotic Limits.
    theo_cycles = 0.5 * math.log(n)
    theo_rho = math.sqrt(math.pi * n / 2)
    theo_tail = math.sqrt(math.pi * n / 8)

    print(f"--- Results for N = {n:,} ({trials} trials) ---")
    print(f"{'Metric':<20} | {'Simulation Avg':<15} | {'Theoretical Limit':<15}")
    print("-" * 58)
    print(f"{'Cycles (Components)':<20} | {sim_cycles:<15.3f} | {theo_cycles:<15.3f}")
    print(f"{'Rho Length':<20} | {sim_rho:<15.3f} | {theo_rho:<15.3f}")
    print(f"{'Tree Path (Tail)':<20} | {sim_tail:<15.3f} | {theo_tail:<15.3f}")

if __name__ == "__main__":
    simulate_random_mappings(100_000, 1_000)
```

{% endcode %}

The program generates the following output:

{% code expandable="true" %}

```
--- Results for N = 100,000 (1000 trials) ---
Metric               | Simulation Avg  | Theoretical Limit
----------------------------------------------------------
Cycles (Components)  | 6.518           | 5.756          
Rho Length           | 392.336         | 396.333        
Tree Path (Tail)     | 198.105         | 198.166
```

{% endcode %}

## Exercise 9.52

Wikipedia covers the topic of [cycle detection](https://en.wikipedia.org/wiki/Cycle_detection) in some detail. It presents several algorithms that may answer the query of this exercise using only O(1) extra space for a random point. Iterating over all points and summing the results provides the required answer.

## Exercise 9.53

The linear time algorithm using linear extra space is depicted below. For clarity the two stages are kept separate, although they could be bundled to avoid traversing the array twice.

{% stepper %}
{% step %}

### Find the root

Iterate through the mapping array `f` once to count how many nodes satisfy `f[i] == i`. If the count is different than 1, then announce that the mapping isn’t a tree and stop.
{% endstep %}

{% step %}

### Check if all nodes connect to the root

Iterate through the mapping array `f` once again and associate `(i, f[i])` to belong to the same group. For this purpose, use the *union-find* data structure (see Section 6.13 in the book). If the mapping is a tree, there should be only one group formed.
{% endstep %}
{% endstepper %}

## Exercise 9.54

The next Python script uses the union-find data structure to track groups and their sizes (see also the previous exercise). The `find` method employs path compression to improve performance.

{% code expandable="true" %}

```python
import random

def get_largest_component_size(n, trials=1000):
    total_max_comp_size = 0

    for _ in range(trials):
        # Generate a random N-mapping (0-indexed).
        f = [random.randrange(n) for _ in range(n)]

        parent = list(range(n))
        size = [1] * n

        def find(i):
            if parent[i] == i:
                return i
            parent[i] = find(parent[i])
            return parent[i]

        def union(i, j):
            root_i = find(i)
            root_j = find(j)
            if root_i != root_j:
                parent[root_j] = root_i
                size[root_i] += size[root_j]

        for i in range(n):
            union(i, f[i])
        total_max_comp_size += max(size[find(i)] for i in range(n))

    return total_max_comp_size / trials

print(f"{'N':<5} | {'Avg Size of Largest Component'}")
print("-" * 37)
random.seed(10)  # Set seed for reproducibility.
for n in range(1, 10):
    avg_size = get_largest_component_size(n, trials=1000)
    print(f"{n:<5} | {avg_size:.3f}")
```

{% endcode %}

The program generates the following output:

{% code expandable="true" %}

```
N     | Avg Size of Largest Component
-------------------------------------
1     | 1.000
2     | 1.771
3     | 2.588
4     | 3.338
5     | 4.142
6     | 4.914
7     | 5.769
8     | 6.529
9     | 7.277
```

{% endcode %}

The computed averages nicely align with the prediction from Table 9.12 even for small $$N$$.

## Exercise 9.55 🌟

{% hint style="success" %}
This exercise formally introduces the [pointing operator](https://en.wikipedia.org/wiki/Symbolic_method_\(combinatorics\)#Other_elementary_constructions) $$\Theta$$ of the symbolic method. It was used many times before, but here it has a special combinatorial role.
{% endhint %}

There are several derivation stages that build upon the idea of looking at random mappings as sets of cycles of Cayley trees.

#### The Structural Decomposition (the "Rho" Shape)

To find the expected rho length of a random point, we first need to isolate a single, random node in the mapping. This is called a pointed mapping. When you point to a random node $$x$$ and trace its path, it forms a sequence of nodes (the tail) that eventually falls into a cycle. Let's break this pointed path down into Cayley trees:

* **The Tail**: The path from $$x$$ to the cycle is effectively a branch of a tree. If this path has $$t$$ nodes, it is structurally equivalent to a sequence of exactly $$t$$ trees rooted along the path.
* **The Cycle**: The cycle has some length $$c$$. The node where the tail meets the cycle accounts for 1 tree. The remaining nodes in the cycle correspond to a sequence of $$c-1$$ trees.&#x20;

Notice that the total number of trees forming this entire backbone is $$t + c - 1$$. Therefore, the rho length is exactly equal to the total number of trees in this specific pointed component.

#### Formulation of the BGF

The EGF that enumerates cycles of trees (unpointed components) is from the book

$$
U(z) = \sum\_{k \ge 1} \frac{C(z)^k}{k}=\ln \frac{1}{1 - C(z)}=\sum\_{N \ge 1} U\_N\frac{z^N}{N!}.
$$

$$U\_N$$ is the number of valid cycles of trees you can build using $$N$$ nodes. Let’s apply the pointing operator $$\Theta$$ on $$U(z)$$ to enumerate pointed components. This gives

$$
\begin{align\*}
P(z) &= z \frac{d}{dz} \left( \ln \frac{1}{1 - C(z)} \right) \\\[0.4cm]
&= \frac{z C'(z)}{1 - C(z)} \\\[0.4cm]
&= \frac{C(z)}{(1 - C(z))^2} && \text{$\left(z C'(z) = \frac{C(z)}{1 - C(z)}\right)$} \\\[0.4cm]
&= \sum\_{N \ge 1} P\_N \frac{z^N}{N!}.
\end{align\*}
$$

Combinatorially, $$P\_N=NU\_N$$ counts the total number of pointed connected components of size $$N$$. For every single one of those unpointed components, there are exactly $$N$$ different nodes you could choose to point to.

If we expand $$P(z)$$ as a geometric series, we get $$\sum\_k k C(z)^k$$. The power $$k$$ represents the number of trees in the component. Since we just proved that the number of trees equals the rho length, we can mark trees by substituting $$C(z)$$ with $$uC(z)$$. We’ve

$$
P(z, u) = \frac{u C(z)}{(1 - u C(z))^2}= \sum\_{N \ge 1} \sum\_{k \ge 1} P\_{Nk} u^k \frac{z^N}{N!}.
$$

Combinatorially, $$P\_{Nk}$$ counts the number of pointed connected components of size $$N$$ where the pointed node has a rho length of exactly $$k$$.

{% hint style="info" %}
But why is the coefficient $$k$$ in $$P(z) = \sum\_k k C(z)^k$$? There are exactly $$k$$ different ways to split a sequence of $$k$$ trees into "the tail" and "the cycle." This is why a component with a rho length of $$k$$ has exactly $$k$$ valid structural configurations! Observe that by pointing to a node we break the symmetry of a cycle (dividing by $$k$$ disappears in $$U(z))$$ and we allow a sequence of $$k$$ trees to form $$k$$ distinct rho shapes. Again, the unit of abstraction is a complete Cayley tree, thus all sub-branching is handled internally and tracked by $$z$$.
{% endhint %}

A pointed mapping consists of exactly one pointed component and a set of unpointed components.\
The EGF for the set of unpointed components is given in the book

$$
E(z) = \exp\left( \ln \frac{1}{1 - C(z)} \right) = \frac{1}{1 - C(z)}.
$$

Our final BGF is the product of these two GFs

$$
M(z, u) = \left( \frac{1}{1 - C(z)} \right) \frac{u C(z)}{(1 - u C(z))^2}= \sum\_{N \ge 1} \sum\_{k \ge 1} M\_{Nk} u^k \frac{z^N}{N!}.
$$

Combinatorially, $$M\_{Nk}$$ counts the number of random $$N$$-mappings, containing a pointed node, where the rho length of that pointed node is exactly $$k$$.

#### Asymptotic Estimation

The total sum of rho lengths across all possible mappings is

$$
H(z) = \left. \frac{\partial}{\partial u} M(z, u) \right|\_{u=1} = \frac{C(z) (1 + C(z))}{(1 - C(z))^4}.
$$

We should use here the general [singularity analysis](https://ac.cs.princeton.edu/online/slides/AC06-SA.pdf) method of analytic combinatorics. We know the Cayley tree function has a dominant singularity at $$z = 1/e$$, where its Taylor expansion behaves as:

$$
C(z) \sim 1 - \sqrt{2(1 - ez)}.
$$

Let's evaluate $$H(z)$$ near this singularity. This gives

$$
H(z) \sim \frac{2}{4(1 - ez)^2} = \frac{1}{2} (1 - ez)^{-2} \implies \[z^N] H(z) \sim \frac{1}{2} e^N N.
$$

We’re now ready for the final assembly by leveraging Stirling’s approximation of $$N!$$

$$
\text{Average Total Rho Length} = \frac{N!}{N^N} \left( \frac{1}{2} e^N N \right) \sim  \frac{\sqrt{2\pi N} (N/e)^N}{N^N} \left( \frac{1}{2} e^N N \right) =  N \sqrt{\frac{\pi N}{2}}.
$$

Dividing this total by $$N$$, we get the average rho length of a random point.

## Exercise 9.56

Based on the result of [Exercise 9.48](#exercise-9.48), we know that the image ratio for the first iteration is $$1-1/e$$, therefore the size of the input for the second iteration is $$k=N(1-1/e)$$. Using the same logic, the new image ratio is $$1-e^{-k/N}$$. Thus, the average number of different integers in the image when a random mapping is iterated twice is $$N(1-e^{-1+1/e}) \approx 0.47N$$.

## Exercise 9.57

This is documented as OEIS [A001372](https://oeis.org/A001372).

## Exercise 9.58 🌟

{% hint style="success" %}
This exercise introduces a handy variant of the Lagrange inversion theorem that avoids the derivative form (3rd case).
{% endhint %}

In a standard mapping, every single node has an out-degree of exactly 1. Consequently, every path is guaranteed to eventually lead to a cycle. In a partial mapping, some nodes have an out-degree of 0. These nodes act as "sinks" where the path simply stops.

This splits the graph into two entirely distinct types of connected components, which is best illuminated by the following symbolic expression (a Cayley tree is denoted by $$\mathcal{T\_C}$$):

$$
\mathcal{P}=\text{SET}(\text{CYC}\_{>0}(\mathcal{T\_C})) \times \text{SET}(\mathcal{T\_C}) \implies P(z)  = \frac{e^{C(z)}}{1 - C(z)}.
$$

We can use the alternative variant of the Lagrange inversion theorem

$$
\[z^N] \frac{g(C(z))}{1 - C(z) \frac{\phi'(C(z))}{\phi(C(z))}} = \[u^N] g(u) \phi(u)^N,
$$

where $$\phi(u)=u/f(u)=e^u$$ and $$g(u)=e^u$$. In our case, $$\phi'(u)/\phi(u)=1$$. This immediately gives

$$
N!\[z^N] P(z) = N!\[u^N] e^{(N+1)u} = (N+1)^N.
$$

## Exercise 9.59

By defining a sequence of $$2N$$ integers strictly bounded between 1 and $$N$$, the mapping $$f$$ forces a  "layered" topological structure:

* The Core $$1 \le x \le N$$: These $$N$$ nodes map exclusively to other nodes in the same range, so they form a standard random $$N$$-mapping.
* The Periphery $$N+1 \le x \le 2N$$: These $$N$$ nodes also map into the core range. Because no node can ever map to them, they are guaranteed to be absolute leaves (nodes with an in-degree of exactly 0).

Assume that "path length" means the total sum of the tree path lengths (the number of steps to reach a cycle) for all $$2N$$ nodes. Let $$T\_N$$ be the total tree path length of a standard random $$N$$-mapping. We know from Table 9.12 that its expected value is

$$
\mathbb{E}\[T\_N] \sim N \sqrt{\frac{\pi N}{8}}.
$$

Every peripheral node $$j$$ takes exactly 1 step to enter the core (by landing on its target $$f(j)$$). Once it lands there, it simply inherits the remaining path length of that target node. Therefore,

$$
\text{Total Peripheral Path Length} = \sum\_{j=N+1}^{2N} \left(1 + \frac{E\[T\_N]}{N}\right) = N \left(1 + \frac{E\[T\_N]}{N}\right) = N + E\[T\_N].
$$

Adding together the two pieces and simplifying gives

$$
\text{Total Path Length} \sim 2 \left( N \sqrt{\frac{\pi N}{8}} \right) + N = N \sqrt{\frac{\pi N}{2}} + N.
$$

The average path length as seen from a random point is the total divided by $$2N$$.

## Exercise 9.60

The next Python script empirically verifies the statistics given in Table 9.12.

{% code expandable="true" %}

```python
import random
import math

def simulate_table_9_12(domain, trials=100, k=2, seed=42):
    random.seed(seed)

    print(f"Running {trials} trials for each N. Using k={k} for k-statistics.\n")

    for n in domain:
        # Accumulators for "Average as seen from a random point"
        sum_avg_rho = sum_avg_tail = sum_avg_cycle = 0
        sum_avg_tree = sum_avg_comp = 0

        # Accumulators for "Average number of"
        sum_k_nodes = sum_k_cycles = 0
        sum_k_comps = sum_k_trees = 0

        # Accumulators for "Extremal parameters"
        sum_max_tail = sum_max_cycle = sum_max_rho = 0
        sum_max_tree = sum_max_comp = 0

        for _ in range(trials):
            f = [random.randrange(n) for _ in range(n)]

            # 1. In-degrees (for k-nodes)
            in_degree = [0] * n
            for v in f:
                in_degree[v] += 1
            sum_k_nodes += in_degree.count(k)

            # 2. Path tracing (Rho, Tail, Cycle)
            rho = [-1] * n
            tail = [-1] * n
            cycle_len = [-1] * n

            mapping_k_cycles = 0
            cycles_found = {}  # cycle_root -> length

            for i in range(n):
                if rho[i] != -1: continue
                path = []
                u = i
                while rho[u] == -1 and u not in path:
                    path.append(u)
                    u = f[u]

                if u in path:
                    idx = path.index(u)
                    c_len = len(path) - idx
                    if c_len == k: mapping_k_cycles += 1
                    for j in range(idx, len(path)):
                        node = path[j]
                        tail[node] = 0
                        rho[node] = c_len
                        cycle_len[node] = c_len
                        cycles_found[node] = c_len  # Mark cycle nodes
                    path = path[:idx]

                next_tail = tail[u] + 1
                next_rho = rho[u] + 1
                c_len_inherit = cycle_len[u]
                for node in reversed(path):
                    tail[node] = next_tail
                    rho[node] = next_rho
                    cycle_len[node] = c_len_inherit
                    next_tail += 1
                    next_rho += 1

            sum_k_cycles += mapping_k_cycles

            # 3. Component and Tree Sizes
            parent = list(range(n))
            comp_size = [1] * n

            def find(i):
                if parent[i] == i: return i
                parent[i] = find(parent[i])
                return parent[i]

            def union(i, j):
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i
                    comp_size[root_i] += comp_size[root_j]

            for i in range(n):
                union(i, f[i])

            # Tree sizes (size of the Cayley tree rooted on the cycle)
            tree_size = [0] * n
            for i in range(n):
                # Walk to the cycle to find which tree this node belongs to.
                u = i
                while tail[u] > 0:
                    u = f[u]
                tree_size[u] += 1

            # Distribute tree/comp sizes back to all nodes for "random point" averages.
            node_tree_size = [0] * n
            node_comp_size = [0] * n
            for i in range(n):
                u = i
                while tail[u] > 0: u = f[u]
                node_tree_size[i] = tree_size[u]
                node_comp_size[i] = comp_size[find(i)]

            # Count k-components (check roots of the disjoint sets)
            mapping_k_comps = sum(1 for i in range(n) if parent[i] == i and comp_size[i] == k)
            sum_k_comps += mapping_k_comps

            # Count k-trees (check tree sizes only at the cycle nodes where tail == 0)
            mapping_k_trees = sum(1 for i in range(n) if tail[i] == 0 and tree_size[i] == k)
            sum_k_trees += mapping_k_trees

            # Accumulate averages for this mapping.
            sum_avg_rho += sum(rho) / n
            sum_avg_tail += sum(tail) / n
            sum_avg_cycle += sum(cycle_len) / n
            sum_avg_tree += sum(node_tree_size) / n
            sum_avg_comp += sum(node_comp_size) / n

            # Accumulate extremals for this mapping.
            sum_max_tail += max(tail)
            sum_max_cycle += max(cycle_len)
            sum_max_rho += max(rho)
            sum_max_tree += max(tree_size)
            sum_max_comp += max(comp_size[find(i)] for i in range(n))

        print(f"=== Results for N = {n} ===")
        print(f"{'Metric':<25} | {'Empirical Avg':<15} | {'Theoretical limit'}")
        print("-" * 63)

        # Point averages
        print(f"Rho Length                | {sum_avg_rho / trials:<15.3f} | {math.sqrt(math.pi * n / 2):.3f}")
        print(f"Tail Length               | {sum_avg_tail / trials:<15.3f} | {math.sqrt(math.pi * n / 8):.3f}")
        print(f"Cycle Length              | {sum_avg_cycle / trials:<15.3f} | {math.sqrt(math.pi * n / 8):.3f}")
        print(f"Tree Size                 | {sum_avg_tree / trials:<15.3f} | {n / 3:.3f}")
        print(f"Component Size            | {sum_avg_comp / trials:<15.3f} | {2 * n / 3:.3f}")

        # K-metrics
        print(
            f"{k}-nodes                   | {sum_k_nodes / trials:<15.3f} | {n * math.exp(-1) / math.factorial(k):.3f}")
        print(f"{k}-cycles                  | {sum_k_cycles / trials:<15.3f} | {1 / k:.3f}")
        print(f"{k}-components              | {sum_k_comps / trials:<15.3f} | {1.5 * math.exp(-k):.3f}")
        print(
            f"{k}-trees                   | {sum_k_trees / trials:<15.3f} | {math.exp(-k) * math.sqrt(math.pi * n / 2):.3f}")

        # Extremals
        print(f"Longest Tail              | {sum_max_tail / trials:<15.3f} | {1.74 * math.sqrt(n):.3f}")
        print(f"Longest Cycle             | {sum_max_cycle / trials:<15.3f} | {0.78 * math.sqrt(n):.3f}")
        print(f"Longest Rho-path          | {sum_max_rho / trials:<15.3f} | {2.41 * math.sqrt(n):.3f}")
        print(f"Largest Tree              | {sum_max_tree / trials:<15.3f} | {0.48 * n:.3f}")
        print(f"Largest Component         | {sum_max_comp / trials:<15.3f} | {0.76 * n:.3f}")
        print("\n")

if __name__ == "__main__":
    simulate_table_9_12([10, 100, 1000], trials=100, k=2)
```

{% endcode %}

It produces the following output:

{% code expandable="true" %}

```
=== Results for N = 10 ===
Metric                    | Empirical Avg   | Theoretical limit
---------------------------------------------------------------
Rho Length                | 3.727           | 3.963
Tail Length               | 1.293           | 1.982
Cycle Length              | 2.434           | 1.982
Tree Size                 | 4.674           | 3.333
Component Size            | 7.556           | 6.667
2-nodes                   | 1.980           | 1.839
2-cycles                  | 0.520           | 0.500
2-components              | 0.210           | 0.203
2-trees                   | 0.590           | 0.536
Longest Tail              | 3.250           | 5.502
Longest Cycle             | 2.650           | 2.467
Longest Rho-path          | 5.760           | 7.621
Largest Tree              | 5.880           | 4.800
Largest Component         | 8.240           | 7.600


=== Results for N = 100 ===
Metric                    | Empirical Avg   | Theoretical limit
---------------------------------------------------------------
Rho Length                | 12.405          | 12.533
Tail Length               | 5.848           | 6.267
Cycle Length              | 6.557           | 6.267
Tree Size                 | 37.613          | 33.333
Component Size            | 69.806          | 66.667
2-nodes                   | 18.800          | 18.394
2-cycles                  | 0.600           | 0.500
2-components              | 0.190           | 0.203
2-trees                   | 1.700           | 1.696
Longest Tail              | 15.470          | 17.400
Longest Cycle             | 7.910           | 7.800
Longest Rho-path          | 22.530          | 24.100
Largest Tree              | 52.120          | 48.000
Largest Component         | 78.130          | 76.000


=== Results for N = 1000 ===
Metric                    | Empirical Avg   | Theoretical limit
---------------------------------------------------------------
Rho Length                | 37.989          | 39.633
Tail Length               | 20.102          | 19.817
Cycle Length              | 17.887          | 19.817
Tree Size                 | 368.728         | 333.333
Component Size            | 667.235         | 666.667
2-nodes                   | 184.900         | 183.940
2-cycles                  | 0.560           | 0.500
2-components              | 0.200           | 0.203
2-trees                   | 5.060           | 5.364
Longest Tail              | 55.700          | 55.024
Longest Cycle             | 22.610          | 24.666
Longest Rho-path          | 75.190          | 76.211
Largest Tree              | 518.800         | 480.000
Largest Component         | 761.410         | 760.000
```

{% endcode %}

## Exercise 9.61

Modern PRNGs, like [Mersenne Twister](https://en.wikipedia.org/wiki/Mersenne_Twister), don’t have short cycles. Therefore, the Python script below use less capable generators to demonstrate the issue of short cycles, although it does showcase a linear congruential generator whose image spans the range (this was mentioned in the book). The cycle detection algorithm is the one from [Exercise 9.52](#exercise-9.52).

{% code expandable="true" %}

```python
def floyd_cycle_finding(f, x0):
    """
    Executes Floyd's Tortoise and Hare algorithm to find cycles.
    Returns (mu, lambda) where mu is the tail length and lambda is the cycle length.
    """
    # Phase 1: Find a collision point inside the cycle.
    tortoise = f(x0)
    hare = f(f(x0))

    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))

    # Phase 2: Find the start of the cycle (tail length, mu).
    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1

    # Phase 3: Find the length of the cycle (lambda).
    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1

    return mu, lam

# ==========================================
# 1. Von Neumann's Middle-Square Generator
# ==========================================
def middle_square(x):
    # Square the 4-digit number (pad to 8 digits) and extract the middle 4.
    sq = str(x * x).zfill(8)
    return int(sq[2:6])

# ==========================================
# 2. 16-bit Linear Congruential Generator
# ==========================================
def lcg_16bit(x):
    # Parameters similar to older glibc implementations.
    a = 22695477
    c = 1
    m = 2 ** 16
    return (a * x + c) % m

# ==========================================
# 3. Bounded Modern PRNG (Random Mapping)
# ==========================================
import random

def create_bounded_rng(N):
    # Pre-generate a random mapping array to ensure deterministic evaluation.
    # This exactly mimics a random N-mapping graph.
    mapping = [random.randrange(N) for _ in range(N)]
    return lambda x: mapping[x]

# ==========================================
# Run the Diagnostics
# ==========================================
if __name__ == "__main__":
    print("Testing Von Neumann's Middle-Square (Seed = 1234):")
    mu, lam = floyd_cycle_finding(middle_square, 1234)
    print(f"Tail Length: {mu}, Cycle Length: {lam}")
    print(f"Rho Length: {mu + lam}\n")

    print("Testing 16-bit LCG (Seed = 42):")
    mu, lam = floyd_cycle_finding(lcg_16bit, 42)
    print(f"Tail Length: {mu}, Cycle Length: {lam}")
    print(f"Rho Length: {mu + lam}\n")

    n = 100_000
    print(f"Testing Bounded Python Random N-Mapping (N = {n}, Seed = 0):")
    bounded_f = create_bounded_rng(n)
    mu, lam = floyd_cycle_finding(bounded_f, 0)
    print(f"Tail Length: {mu}, Cycle Length: {lam}")
    print(f"Rho Length: {mu + lam}")
```

{% endcode %}

It produces the following output:

{% code expandable="true" %}

```
Testing Von Neumann's Middle-Square (Seed = 1234):
Tail Length: 56, Cycle Length: 1
Rho Length: 57

Testing 16-bit LCG (Seed = 42):
Tail Length: 0, Cycle Length: 65536
Rho Length: 65536

Testing Bounded Python Random N-Mapping (N = 100000, Seed = 0):
Tail Length: 91, Cycle Length: 20
Rho Length: 111
```

{% endcode %}

## Exercise 9.62

See the previous exercise. The middle-square method is famous for collapsing into extremely short cycles or hitting zero.

## Exercise 9.63

The next script demonstrates how "collapses" occur for non prime modulus. For a prime $$N$$, all values of $$c$$ produce an expected result $$O(\sqrt N)$$, on the average. The strangest behavior happens for a power of prime. At any rate, instead of averaging over many randomly selected values of $$c$$, the program shows how various choices influence the simulated random mapping.

{% code expandable="true" %}

```python
import math
import random

def floyd_cycle_finding(f, x0):
    tortoise = f(x0)
    hare = f(f(x0))

    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))

    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1

    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1
    return mu, lam

def create_quadratic_generator(c, n):
    return lambda x: (x ** 2 + c) % n

def test():
    test_cases = [
        (100003, "Prime"),
        (9991, "Semi-prime (97 x 103)"),
        (10000, "Highly Composite (10^4)"),
        (823543, "Power of Prime (7^7)")
    ]
    c_values = [1, 2, 3, 11]
    trials = 100

    print(f"{'N':<10} | {'Type':<25} | {'c':<3} | {'Empirical Rho':<15} | {'Theoretical (if prime)'}")
    print("-" * 87)

    for n, description in test_cases:
        theoretical_limit = math.sqrt(math.pi * n / 2)

        for c in c_values:
            f = create_quadratic_generator(c, n)
            total_rho = 0
            for _ in range(trials):
                x0 = random.randrange(n)
                mu, lam = floyd_cycle_finding(f, x0)
                total_rho += (mu + lam)
            avg_rho = total_rho / trials

            print(f"{n:<10} | {description:<25} | {c:<3} | {avg_rho:<15.2f} | {theoretical_limit:.2f}")
        print("-" * 87)

if __name__ == "__main__":
    test()
```

{% endcode %}

The program creates the following report:

```
N          | Type                      | c   | Empirical Rho   | Theoretical (if prime)
---------------------------------------------------------------------------------------
100003     | Prime                     | 1   | 475.26          | 396.34
100003     | Prime                     | 2   | 554.26          | 396.34
100003     | Prime                     | 3   | 243.27          | 396.34
100003     | Prime                     | 11  | 250.83          | 396.34
---------------------------------------------------------------------------------------
9991       | Semi-prime (97 x 103)     | 1   | 39.45           | 125.28
9991       | Semi-prime (97 x 103)     | 2   | 29.29           | 125.28
9991       | Semi-prime (97 x 103)     | 3   | 16.81           | 125.28
9991       | Semi-prime (97 x 103)     | 11  | 11.81           | 125.28
---------------------------------------------------------------------------------------
10000      | Highly Composite (10^4)   | 1   | 12.74           | 125.33
10000      | Highly Composite (10^4)   | 2   | 178.12          | 125.33
10000      | Highly Composite (10^4)   | 3   | 31.53           | 125.33
10000      | Highly Composite (10^4)   | 11  | 12.28           | 125.33
---------------------------------------------------------------------------------------
823543     | Power of Prime (7^7)      | 1   | 70988.38        | 1137.37
823543     | Power of Prime (7^7)      | 2   | 117651.02       | 1137.37
823543     | Power of Prime (7^7)      | 3   | 13.19           | 1137.37
823543     | Power of Prime (7^7)      | 11  | 8241.22         | 1137.37
---------------------------------------------------------------------------------------
```

## Exercise 9.99<sup>\*</sup>

The text of this exercise is available at the book’s [website](https://aofa.cs.princeton.edu/90maps/); it’s not in the book, but it should be. The EGF that defines random mappings with no singleton cycles, as a function of the Cayley function, is

$$
F(z)=\frac{e^{-C(z)}}{1-C(z)}.
$$

The alternative variant of the Lagrange inversion theorem (see [Exercise 9.58](#exercise-9.58)) immediately gives

$$
\text{Probability}=\frac{N!}{N^N} \[z^N]F(z)=\frac{N!}{N^N}\[u^N] e^{(N-1)u}=\frac{N!}{N^N} \cdot \frac{(N-1)^N}{N!}= \left( 1 - \frac{1}{N} \right)^N \sim 1/e.
$$

The last step follows by letting $$N \to \infty$$.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://evarga.gitbook.io/sh-aofa/algorithms-and-combinatorial-structures/chapter-9-words-and-mappings.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.