Chapter 2

Exercise 2.1

def fib_recursive(n):
    return fib_recursive(n - 1) + fib_recursive(n - 2) if n > 1 else n

def fib_iterative(n):
    a, b = 0, 1
    for _ in range(1, n + 1):
        a, b = b, a + b
    return a

The recursive version needlessly recomputes the same values and has an exponential running time. The iterative version eliminates this deficiency. Otherwise, both correctly return $F_{20}=6765$.
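As a quick sanity check, we can count the calls made by the recursive version (the call-counting helper below is my own addition; it mirrors the same recursion, and the function is repeated so the snippet is self-contained):

```python
def fib_recursive(n):
    return fib_recursive(n - 1) + fib_recursive(n - 2) if n > 1 else n

def count_calls(n):
    # Total number of fib_recursive invocations triggered by fib_recursive(n).
    return 1 + count_calls(n - 1) + count_calls(n - 2) if n > 1 else 1

print(fib_recursive(20))  # 6765
print(count_calls(20))    # 21891 calls to produce a single value
```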

Exercise 2.2

The total number of arithmetic operations (additions, subtractions and divisions), including loop increments and index calculations, is

$$C_{N_{max}}=\sum_{N=1}^{N_{max}}\left(1+1+\sum_{k=0}^{N-1}(1+5)\right)=N_{max}(3N_{max}+5).$$

Exercise 2.3

def c(n):
    return (1 + 1 / n) * c(n - 1) + 2 if n > 1 else 2

It uses only $5(N_{max}-1)$ arithmetic operations to compute the same value as Program 2.1.
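A quick cross-check (the iterative twin below is my own sketch of evaluating the same recurrence bottom-up; it is not the book's Program 2.1):

```python
def c(n):
    return (1 + 1 / n) * c(n - 1) + 2 if n > 1 else 2

def c_iterative(n):
    # Evaluate the same recurrence c_n = (1 + 1/n) c_{n-1} + 2 bottom-up.
    value = 2.0
    for k in range(2, n + 1):
        value = (1 + 1 / k) * value + 2
    return value

assert abs(c(500) - c_iterative(500)) < 1e-9
```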

Exercise 2.4

A non-recursive program would require $\Theta(N)$ time to compute values using recurrence (2) and $\Theta(N^2)$ for recurrence (3), as shown in Exercise 2.2.

A recursive program would also require $\Theta(N)$ time to compute values using recurrence (2). Despite an exponentially growing number of subproblems, their sizes also drop exponentially. On the other hand, calculating values with recurrence (3), as mentioned in the book, has unacceptable performance. By instrumenting the code and counting the number of function calls (excluding the base cases) we get an interesting sequence of numbers for $N \in [1, 10]$. It is $(1,3,9,27,81,243,729,2187,6561,19683)$, where the elements are powers of 3. Therefore, the number of operations is $\Theta(3^N)$. As a side note, we can formally prove this by induction on $N$, using the recurrence describing the number of calls, $r_N=1+2\sum_{k=0}^{N-1}r_k$ (with $r_0=0$), as the inductive hypothesis.

Qualifying variants with tight asymptotic bounds is convenient here. For example, we don't need more details to understand that $\Theta(3^N)$ is prohibitive.
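The instrumentation can be sketched as follows. I'm assuming recurrence (3) is the full-history quicksort recurrence $C_N = N + 1 + \frac{1}{N}\sum_{1 \le k \le N}(C_{k-1}+C_{N-k})$, so the exact form of the recurrence is an assumption here; the call counts come out the same powers of 3:

```python
def count_calls(N):
    # Count recursive invocations (excluding base cases) made by a naive
    # recursive evaluation of C_N = N + 1 + (1/N) * sum(C_{k-1} + C_{N-k}).
    calls = 0

    def C(n):
        nonlocal calls
        if n <= 0:
            return 0.0  # base case: not counted
        calls += 1
        return n + 1 + sum(C(k - 1) + C(n - k) for k in range(1, n + 1)) / n

    C(N)
    return calls

print([count_calls(n) for n in range(1, 9)])  # [1, 3, 9, 27, 81, 243, 729, 2187]
```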

Exercise 2.5

We can't directly implement the recurrence for the radix exchange sort using the technique demonstrated in Program 2.1: to calculate $C_N$ we would already need to know $C_N$, since it appears on both sides. Let's transform it:

$$\begin{align*} C_N &= N+\frac{1}{2^N} \sum_{k=0}^N {\binom{N}{k}(C_k+C_{N-k})} \\ &= N+\frac{2}{2^N} \sum_{k=0}^N {\binom{N}{k}C_k} && \text{$\left(\binom{N}{k}=\binom{N}{N-k}\right)$} \\ &= N+\frac{1}{2^{N-1}} \left(\left(\sum_{k=0}^{N-1} {\binom{N}{k}C_k}\right)+C_N\right) \\ &= N+\frac{1}{2^{N-1}} \left(\sum_{k=0}^{N-1} {\binom{N}{k}C_k}\right)+\frac{1}{2^{N-1}}C_N \\ &= \frac{2^{N-1}}{2^{N-1}-1} \left(N+\frac{1}{2^{N-1}} \sum_{k=0}^{N-1} {\binom{N}{k}C_k}\right). \end{align*}$$

The Java program below evaluates this recurrence, along with the recurrences for the two quicksort variants.
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.util.HashMap;
import java.util.Map;

public class CalculateValues {
    private static final int MAXN = 5000;
    private static final int STEP = 1000;

    public static void main(String[] args) {
        System.out.printf("%8s | %8s | %8s| %8s%n", "n", "standard", "3-way", "radix");
        System.out.println("----------------------------------------");
        for (int n = STEP; n <= MAXN; n += STEP) {
            int m = n - STEP + 1;
            int standard = standardQuickSort(m, n);
            int threeWay = threeWayQuickSort(m, n);
            int radix = radixExchangeSort(m, n);
            System.out.printf("%8d | %8d | %8d| %8d%n", n, standard, threeWay, radix);
        }
    }

    private static final double[] c1 = new double[MAXN + 1];

    private static int standardQuickSort(int minN, int maxN) {
        for (int n = minN; n <= maxN; n++)
            c1[n] = (n + 1) * c1[n - 1] / n + 2;
        return (int) c1[maxN];
    }

    private static final double[] c2 = new double[MAXN + 1];

    private static int threeWayQuickSort(int minN, int maxN) {
        for (int n = Integer.max(minN, 4); n <= maxN; n++) {
            long d = (long) n * (n - 1) * (n - 2) / 6;
            c2[n] = n + 1;
            for (int k = 1; k <= n; k++)
                c2[n] += ((k - 1D) * (n - k) / d) * (c2[k - 1] + c2[n - k]);
        }
        return (int) c2[maxN];
    }

    private static final MathContext MC = MathContext.DECIMAL128;
    private static final Map<Long, BigInteger> CHOOSE_CACHE = new HashMap<>();

    private static BigInteger nChooseK(int n, int k) {
        if (k < 0 || k > n) return BigInteger.ZERO;
        if (k == 0 || k == n) return BigInteger.ONE;
        if (k > n - k) k = n - k;
        long key = (((long) n) << 32) | (k & 0xffffffffL);
        BigInteger cached = CHOOSE_CACHE.get(key);
        if (cached != null) return cached;
        BigInteger value = nChooseK(n - 1, k - 1).add(nChooseK(n - 1, k));
        CHOOSE_CACHE.put(key, value);
        return value;
    }

    private static final BigDecimal[] c3 = new BigDecimal[MAXN + 1];

    private static int radixExchangeSort(int minN, int maxN) {
        for (int i = minN; i <= maxN; i++) c3[i] = BigDecimal.ZERO;

        for (int n = Integer.max(minN, 2); n <= maxN; n++) {
            for (int k = 2; k < n; k++) {
                BigDecimal binom = new BigDecimal(nChooseK(n, k));
                c3[n] = c3[n].add(binom.multiply(c3[k], MC), MC);
            }
            BigDecimal f = new BigDecimal(BigInteger.ONE.shiftLeft(n - 1));
            BigDecimal d = f.subtract(BigDecimal.ONE);
            BigDecimal term1 = BigDecimal.valueOf(n).multiply(f).divide(d, MC);
            BigDecimal term2 = c3[n].divide(d, MC);
            c3[n] = term1.add(term2, MC);
        }
        return c3[maxN].toBigInteger().intValue();
    }
}

For efficiently calculating the binomial coefficients, the program employs memoization. Here’s the output:

Table 2.5 - Estimated average number of comparisons of quicksort variants
       n | standard |    3-way|    radix
----------------------------------------
    1000 |    12985 |     9684|    11297
    2000 |    28729 |    21734|    24596
    3000 |    45519 |    34680|    38649
    4000 |    62988 |    48209|    53193
    5000 |    80963 |    62171|    68101

Analysis of Table 2.5

Based on the closed-form formulas for the quicksort variants, as well as the analysis of radix exchange sort, we'd expect the average number of compares to approach the following asymptotic approximations for sufficiently large $N$:

  • Standard quicksort: $\sim 2N\ln N \approx 1.386N\lg N$

  • Median-of-three quicksort: $\sim (12/7)N\ln N \approx 1.188N\lg N$

  • Radix exchange sort: $\sim N\lg N$

Nonetheless, for smaller values of $N$ we can't ignore lower-order terms, as the data in Table 2.5 shows. This exercise is a perfect example of why relying only on the leading asymptotic term can lead to wrong conclusions for practical input sizes. The book highlights this fact many times. Gradual asymptotic expansion allows fine-tuning the model to the desired accuracy over the input size range of interest.

Exercise 2.6

Denote by $U_n$ and $V_n$ the solutions to the homogeneous recurrences $u_n$ and $v_n$, respectively, where $f(x,y)=x+y$.

$$u_n=f(u_{n-1},u_{n-2}) \text{ for $n > 1$ with $u_0=1$ and $u_1=0$},\\ v_n=f(v_{n-1},v_{n-2}) \text{ for $n > 1$ with $v_0=0$ and $v_1=1$}.$$

According to the book, $a_n=pU_n+qV_n$, where $U_n=F_{n-1}$ and $V_n=F_n$, thus $a_n=pF_{n-1}+qF_n$ for $n>1$. The reason $U_n=F_{n-1}$ is that it starts as $1,0,1,1,\dots$, hence behaves like the Fibonacci sequence from the second element on.

Exercise 2.7

The homogeneous part remains the same as in Exercise 2.6. The constant term $r$ obeys the same Fibonacci-like growth pattern, with the difference that each iteration also adds one additional instance of $r$. Imagine always starting a new Fibonacci sequence and adding it to the existing count. We get $a_n=pF_{n-1}+qF_n+r\sum_{k=0}^{n-1}F_k$ for $n>1$.

🌟 Exercise 2.8

This is a generalization of the previous two exercises. For $f$ linear, we can express the solution to the recurrence

$$a_n=f(a_{n-1},a_{n-2}) \quad \text{for $n>1$}$$

in terms of $a_0$, $a_1$, $f(0,0)$, $U_n$ and $V_n$, where $U_n$ and $V_n$ are solutions to the homogeneous recurrences $u_n$ and $v_n$, respectively.

$$u_n=f(u_{n-1},u_{n-2})-f(0,0) \text{ for $n > 1$ with $u_0=1$ and $u_1=0$},\\ v_n=f(v_{n-1},v_{n-2})-f(0,0) \text{ for $n > 1$ with $v_0=0$ and $v_1=1$}.$$

Thus, $a_n=a_0U_n+a_1V_n+f(0,0)\sum_{k=0}^{n-1}V_k$ for $n>1$.
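A numerical spot check of the formula, with a hypothetical linear function $f(x,y)=2x+3y+5$ (so $f(0,0)=5$) and arbitrary initial values:

```python
def f(x, y):
    return 2 * x + 3 * y + 5  # hypothetical linear f with f(0, 0) = 5

a0, a1 = 7, 11  # arbitrary initial values

a = [a0, a1]          # direct iteration of a_n = f(a_{n-1}, a_{n-2})
U, V = [1, 0], [0, 1]  # homogeneous solutions: f minus its constant term f(0, 0)
for n in range(2, 21):
    a.append(f(a[-1], a[-2]))
    U.append(f(U[-1], U[-2]) - 5)
    V.append(f(V[-1], V[-2]) - 5)

for n in range(2, 21):
    assert a[n] == a0 * U[n] + a1 * V[n] + 5 * sum(V[:n])
```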

Exercise 2.9

This immediately reduces to a product

$$\begin{align*} a_n &= \prod_{k=1}^n \frac{k}{k+2} && \text{(for $n>0$)}\\ &=\frac{1}{3} \cdot \frac{2}{4} \cdots \frac{n-1}{n+1} \cdot \frac{n}{n+2} \\ &= \frac{n!}{(n+2)!/2} \\ &= \frac{2}{(n+1)(n+2)}. \end{align*}$$

Exercise 2.10

This immediately reduces to a sum

$$\begin{align*} a_n &= 1+\sum_{k=1}^n (-1)^k k && \text{(for $n>0$)}\\ &= 1+(-1+2-3+\dots+(-1)^n n) \\ &= 1+\begin{cases} \dfrac{n}{2} &\text{if $n$ is even}, \\ -\dfrac{n+1}{2} &\text{if $n$ is odd}. \end{cases} \\ &=1+(-1)^n\left\lceil \frac{n}{2} \right\rceil. \end{align*}$$

Exercise 2.11

The solution is posted on the book’s website.

Exercise 2.12

Following the hint from the book, we get

$$\begin{align*} \frac{a_n}{2^n} &= \frac{a_{n-1}}{2^{n-1}}+\frac{1}{2^n} && \text{(for $n>1$)}\\ &= \frac{a_{n-2}}{2^{n-2}}+\frac{1}{2^{n-1}}+\frac{1}{2^n} \\ &= \dots \\ &= \frac{a_1}{2^1}+\frac{1}{2^2}+\dots+\frac{1}{2^n} \\ &\therefore \boxed{a_n=\sum_{k=0}^{n-1}2^k=2^n-1}. \end{align*}$$

Exercise 2.13

If we multiply both sides by $(n+1)$ and iterate, we get

$$\begin{align*} (n+1)a_n &= na_{n-1}+(n+1) && \text{(for $n>0$)}\\ &=(n-1)a_{n-2}+n+(n+1) \\ &\dots \\ &= a_0+2+\dots+n+(n+1) \\ &= \sum_{k=1}^{n+1} k \\ &= \frac{(n+1)(n+2)}{2} \\ &\therefore \boxed{a_n=\frac{n+2}{2}}. \end{align*}$$

🌟 Exercise 2.14

We can directly write down the solution to

$$a_n=x_na_{n-1}+y_n \quad \text{for $n>t$}$$

in terms of the $x$'s, the $y$'s and the initial value $a_t$. Following the same steps as in the proof of Theorem 2.1 and stopping at $a_t$, we get

$$a_n=y_n+a_t(x_nx_{n-1}\dots x_{t+1})+\sum_{t+1 \le j <n}y_jx_{j+1}x_{j+2}\dots x_n \text{ for $n>t$}.$$
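The formula is easy to test numerically against direct iteration (the coefficient sequences below are arbitrary, generated just for the check):

```python
import random

random.seed(7)
t, N = 3, 30
x = {n: random.randint(1, 5) for n in range(t + 1, N + 1)}
y = {n: random.randint(-3, 3) for n in range(t + 1, N + 1)}
a_t = 2

# Direct iteration of a_n = x_n a_{n-1} + y_n, starting from a_t.
a = {t: a_t}
for n in range(t + 1, N + 1):
    a[n] = x[n] * a[n - 1] + y[n]

def xprod(lo, hi):
    # Product x_lo x_{lo+1} ... x_hi (an empty range gives 1).
    p = 1
    for i in range(lo, hi + 1):
        p *= x[i]
    return p

for n in range(t + 1, N + 1):
    closed = y[n] + a_t * xprod(t + 1, n) \
        + sum(y[j] * xprod(j + 1, n) for j in range(t + 1, n))
    assert closed == a[n]
```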

Exercise 2.15

After dividing both sides by $n$ we have

$$\begin{align*} a_n&=\frac{n+1}{n}a_{n-1}+2 && \text{(for $n>0$)}\\ &=2+2(n+1)(H_n-1) && \text{(by Theorem 2.1)} \\ &=2(n+1)(H_{n+1}-1) && \text{($2=2(n+1) \cdot \frac{1}{n+1}$)}. \end{align*}$$

🌟 Exercise 2.16

After dividing both sides by $n$ we get a first-order linear recurrence

$$a_n=\frac{n-4}{n}a_{n-1}+12H_n \text{ for $n>4$ and $a_n=0$ for all $n \le 4$}.$$

We can now use the generalized Theorem 2.1 (see Exercise 2.14) with $t=4$. The reduced summation factor $s_{j+1}=x_{j+1}x_{j+2}\dots x_n$ equals

$$\begin{align*} s_{j+1}&=\frac{j-3}{j+1} \cdot \frac{j-2}{j+2} \cdots \frac{n-4}{n} \\ &= \frac{j(j-1)(j-2)(j-3)}{n(n-1)(n-2)(n-3)} \\ &= \frac{\dbinom{j}{4}}{\dbinom{n}{4}}. \end{align*}$$

After substituting $s_{j+1}$ back into the main recurrence, we have

$$\begin{align*} a_n&=12H_n+\frac{12}{\dbinom{n}{4}}\sum_{j=5}^{n-1} {\binom{j}{4}H_j} \\ &=\frac{12}{\dbinom{n}{4}}\sum_{j=5}^n {\binom{j}{4}H_j} \\ &= \frac{12}{\dbinom{n}{4}}\left(\sum_{j=4}^n \left({\binom{j}{4}H_j}\right)-\binom{4}{4}H_4\right) \\ &= \frac{12}{\dbinom{n}{4}}\left(\binom{n+1}{5}\left(H_{n+1}-\frac{1}{5}\right)-H_4\right) && \text{(by formula (6.70))}. \end{align*}$$

The previous derivation has used the identity (6.70) from the book Concrete Mathematics: A Foundation for Computer Science (2nd ed.).
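We can verify the closed form with exact rational arithmetic (my own check, not from the book):

```python
from fractions import Fraction
from math import comb

def H(n):
    # Harmonic number H_n as an exact fraction.
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Direct iteration: a_n = ((n - 4)/n) a_{n-1} + 12 H_n, with a_n = 0 for n <= 4.
a = Fraction(0)
for n in range(5, 41):
    a = Fraction(n - 4, n) * a + 12 * H(n)

n = 40
closed = Fraction(12, comb(n, 4)) * (comb(n + 1, 5) * (H(n + 1) - Fraction(1, 5)) - H(4))
assert a == closed
```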

🌟 Exercise 2.17

After simplifying the expression for $A_N$, we get

$$A_N=\frac{N-6}{N}A_{N-1}+2 \text{ for $N>1$ and $A_1=1$}.$$

Before proceeding we must resolve the base cases for $N \le 6$, ensuring that the summation factor is non-zero. Using the recurrence for $A_N$ we get $A_2=0$, $A_3=2$, $A_4=1$, $A_5=9/5$, $A_6=2$. Following the same steps as in Exercise 2.16 (see also Table 2.2 in the book), we get

$$A_N=\frac{2(N+1)}{7} \text{ for $N>6$}.$$
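Exact rational arithmetic confirms both the base cases and the closed form:

```python
from fractions import Fraction

# Iterate A_N = ((N - 6)/N) A_{N-1} + 2 with A_1 = 1.
A = {1: Fraction(1)}
for N in range(2, 31):
    A[N] = Fraction(N - 6, N) * A[N - 1] + 2

assert [A[N] for N in range(2, 7)] == [0, 2, 1, Fraction(9, 5), 2]
assert all(A[N] == Fraction(2 * (N + 1), 7) for N in range(7, 31))
```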

🌟 Exercise 2.18

From the error term $b_n = a_n - \alpha$ we have $a_{n-1} = b_{n-1} + \alpha$. We can substitute this back into the main recurrence

$$a_n = \frac{1}{1 + a_{n-1}} = \frac{1}{1 + \alpha + b_{n-1}}.$$

Therefore, we have

$$\begin{align*} b_n &= a_n - \alpha \\ &= \frac{1}{1 + \alpha + b_{n-1}} - \alpha \\ &= \frac{1 - \alpha(1 + \alpha + b_{n-1})}{1 + \alpha + b_{n-1}} \\ &= \frac{1 - \alpha - \alpha^2 - \alpha b_{n-1}}{1 + \alpha + b_{n-1}} \\ &= -\frac{\alpha}{1 + \alpha + b_{n-1}} \cdot b_{n-1} && \text{($1 - \alpha - \alpha^2 = 0$)}. \end{align*}$$

As $n \to \infty$, $a_n \to \alpha$, thus $b_n \to 0$; see Exercise 2.19 for an explanation of why $a_n \to \alpha$ when $n \to \infty$ and $a_0\in (0,1]$. For small $b_{n-1}$ we have

$$\frac{1}{1 + \alpha + b_{n-1}} \approx \frac{1}{1 + \alpha} \implies b_n \approx -\frac{\alpha}{1 + \alpha} b_{n-1}.$$

Notice that $\alpha = \frac{1}{1+\alpha} \implies \frac{\alpha}{1 + \alpha} = \alpha^2$, so we can derive an approximate formula

$$\begin{align*} b_n &\approx -\alpha^2 b_{n-1} \\ &= (-\alpha^2)^n b_0 && \text{(iterating)}\\ &=(-1)^n\alpha^{2n}(a_0-\alpha). \end{align*}$$

At any rate, roughly, $|b_n| \approx \alpha^2 |b_{n-1}|$, so the error shrinks by a constant factor at each iteration.
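A quick numerical experiment supports the approximation (the starting value is arbitrary):

```python
from math import sqrt

alpha = (sqrt(5) - 1) / 2  # the positive root of x^2 + x - 1 = 0
a = 0.3                    # arbitrary starting value in (0, 1]
ratios = []
for _ in range(10):
    b_prev = a - alpha
    a = 1 / (1 + a)
    ratios.append((a - alpha) / b_prev)

# After a few iterations each step multiplies the error by roughly -alpha^2.
assert all(abs(r + alpha ** 2) < 0.02 for r in ratios[2:])
```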

🌟 Exercise 2.19

The solution is known as the Dottie number. We must show that all conditions for applying the Banach fixed-point theorem are satisfied; the theorem then guarantees that iterating $\cos(x)$ converges to a unique fixed point for any $a_0 \in [0,1]$.

Let $f(x)=\cos(x)$ and $x \in [0,1]$. We know that $f:[0,1] \to [0,1]$ and that $f$ is differentiable on this whole interval. Thus, by the mean value theorem, for any distinct $x,y \in [0,1]$, there's a point $c$ between $x$ and $y$ such that $f(x)-f(y)=f'(c)(x-y)$. Consequently, we can always express the distance between $f(x)$ and $f(y)$ in terms of the distance between $x$ and $y$. If $|f'(c)| < 1$ for all $c \in (0,1)$, then $f$ has a Lipschitz constant less than 1, therefore $f$ is a contraction mapping on $[0,1]$. This is indeed the case, since $|f'(x)|=|\sin(x)|$ and $\sin(1)<1$.

We can also represent the continued fraction (mentioned in the book) as $f(x)=1/(1+x)$ for $x \in (0,1]$. It's easy to show that all conditions for applying the Banach theorem hold.

Exercise 2.20

The function $f(x)=0.5(x-1/x)$ has a singularity at zero. As $|x| \to 1$, $|f(x)| \to 0$, and as $|x| \to 0$, $|f(x)| \to \infty$. The sequence can't converge: as soon as it starts to approach $\pm 1$, it moves toward zero, from near zero to large values, and back again. These wild oscillations continue forever. Of course, $|x|=1$ would immediately "push" $f$ into the singularity.

🌟 Exercise 2.21

We prove the lower bound in a similar fashion to the book's proof of the upper bound:

$$\begin{align*} \frac{1}{a_n} &= \frac{1}{a_{n-1}} \left(\frac{1}{1-a_{n-1}}\right) && \text{(for $n>0$)}\\ &=\frac{1}{a_{n-1}}+\frac{1}{1-a_{n-1}} \\ &< \frac{1}{a_{n-1}}+2 && \text{($1-a_{n-1} > 1/2$ for sufficiently large $n$)} \\ &\dots \\ &< 2(n+1) && \text{(telescoping sum)} \\ &\le 3n && \text{(for all $n>1$)} \\ &\therefore \boxed{a_n \ge \frac{1}{3} \cdot \frac{1}{n} = \Omega(1/n)}. \end{align*}$$

Both asymptotic bounds match, hence $a_n=\Theta(1/n)$. After computing initial terms, it seems that $a_n \sim c/n$ with $c=1$. Let's prove that $\lim_{n \to \infty} na_n=\lim_{n \to \infty}\frac{n}{1/a_n}=1$. According to the book, $a_n \to 0$ as $n \to \infty$, so $1/a_n \to \infty$. We can apply the Stolz–Cesàro theorem, which is a discrete version of L'Hôpital's rule. The theorem states that if the limit of the difference quotient $\frac{n - (n-1)}{1/a_n - 1/a_{n-1}}$ exists, then $\lim_{n \to \infty} \frac{n}{1/a_n}$ equals that same limit. The numerator is 1, and the denominator is $1/(1-a_{n-1})$ (see the derivation of the lower bound). Thus,

$$\lim_{n \to \infty} \frac{1}{1/(1-a_{n-1})} = \lim_{n \to \infty} (1-a_{n-1}) = 1.$$

By the Stolz–Cesàro theorem, since the limit of the differences is 1, the limit of the original sequence is also 1.

🌟 Exercise 2.22

First we prove that $\lim_{n \to \infty}a_n=0$. For $x \in (0,1]$ we have $0<\sin(x)<x$; the geometric proof relies on the definition of the radian. Consequently, $a_n$ is a strictly decreasing sequence bounded from below by zero. By the monotone convergence theorem, it converges to its infimum; since the limit $L$ must satisfy $L=\sin(L)$, whose only solution is zero, the limit is zero.

Since $a_n \to 0$, we can use the Taylor series for $\sin(x)$ around $x=0$:

$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots = x - \frac{x^3}{6} + O(x^5).$$

We can substitute this expansion into the recurrence $a_n = \sin(a_{n-1})$ to get

$$a_n \approx a_{n-1} - \frac{a_{n-1}^3}{6}=a_{n-1}\left(1 - \frac{a_{n-1}^2}{6}\right).$$

If we square both sides of the equation, then we can use the approximation $(1-x)^2 \approx 1-2x$ for small $x$. Thus,

$$\begin{align*} a_n^2 &\approx a_{n-1}^2\left(1 - \frac{a_{n-1}^2}{6}\right)^2 \\ &=a_{n-1}^2\left(1 - \frac{a_{n-1}^2}{3}\right). \end{align*}$$

Following the hint from the book, let's flip the above equation to get

$$\begin{align*} \frac{1}{a_n^2} &\approx \frac{1}{a_{n-1}^2 \left(1 - \frac{a_{n-1}^2}{3}\right)} \\ &\approx \frac{1}{a_{n-1}^2} \left(1 + \frac{a_{n-1}^2}{3}\right) && \text{($\frac{1}{1-x} \approx 1+x$ for small $x$)} \\ &= \frac{1}{a_{n-1}^2} + \frac{1}{3}. \end{align*}$$

Now we can use the substitution $b_n=1/a_n$ to obtain a telescoping sum

$$b_n^2 \approx b_{n-1}^2 + \frac{1}{3}=\dots=b_0^2+\frac{n}{3}=1+\frac{n}{3}.$$

Therefore, $b_n=O(\sqrt n) \implies a_n=O(1/\sqrt n)$.
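Numerically, $1/a_n^2$ indeed grows like $n/3$:

```python
from math import sin

a = 1.0
n = 10_000
for _ in range(n):
    a = sin(a)

# 1/a_n^2 ~ n/3 up to lower-order terms.
assert abs(1 / a ** 2 / n - 1 / 3) < 0.01
```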

🌟 Exercise 2.23

If $|f'(\alpha)| > 1$, then $\alpha$ is a repelling fixed point of the differentiable function $f$. The sequence $a_n = f(a_{n-1})$ starting at any point $a_0$ near $\alpha$ (but not exactly $\alpha$) will move away from the fixed point $\alpha$. To develop an intuition for why, let's use the familiar error term from the book, $b_n=a_n-\alpha$, and monitor how it changes in each iteration.

$$\begin{align*} a_n &= f(a_{n-1}) \\ \alpha + b_n &= f(\alpha + b_{n-1}) \\ &\approx f(\alpha) + f'(\alpha) \cdot b_{n-1} && \text{(by Taylor expansion near $\alpha$)} \\ &\therefore \boxed{b_n \approx f'(\alpha) \cdot b_{n-1}} && \text{($\alpha=f(\alpha)$)}. \end{align*}$$

Since $|f'(\alpha)| > 1$, the error term grows over time instead of shrinking. In general, what happens in later iterations depends highly on the global properties of the function. The point is, whenever the error term becomes small, the above approximation for $b_n$ becomes relevant and the error starts growing again.

🌟 Exercise 2.24

The analysis relies on the approximation of the error term $b_n$ from Exercise 2.23 and on the precondition stated in the book that $a_0$ is sufficiently close to $\alpha$. The three cases are:

  1. When $0 < |f'(\alpha)| < 1$ (simple convergence) the fixed point $\alpha$ is attracting and the sequence $a_n$ is guaranteed to converge to $\alpha$. The error drops by a constant factor $|f'(\alpha)|$ at each step, hence the number of significant digits increases by a fixed amount in each iteration.

  2. When $f'(\alpha) = 0$ (quadratic convergence) the fixed point $\alpha$ is super-attracting. Convergence is guaranteed and extremely fast. Here, the error term must be described more precisely using a higher-order derivative, assuming $f$ is at least twice continuously differentiable; for example, $b_n \approx \left| \frac{f''(\alpha)}{2} \right| \cdot b_{n-1}^2$. The error at each step is proportional to the square of the previous error, so each iteration approximately doubles the number of significant digits.

  3. When $|f'(\alpha)| = 1$ ("slow" convergence) the fixed point $\alpha$ is neutral. Convergence isn't guaranteed by this criterion alone and depends on the higher-order terms of $f$. If the sequence does converge, the speed is sublinear; the shrinkage of the error term isn't driven by a constant factor, but rather decreases "harmonically" (see Exercise 2.21 and Exercise 2.22).

Testing Convergence Using Only $a_0$

Since $\alpha$ is unknown (it is usually the value we are trying to find), we cannot check $f'(\alpha)$ directly. Instead, we must rely on theorems that predict convergence based on a starting region $I = [a_0 - \delta, a_0 + \delta]$, for some constant $\delta>0$, that contains $\alpha$ (see Exercise 2.19).

For practical purposes, we should check the derivative $|f'(a_0)|$. We have three major cases:

  1. If $|f'(a_0)| < 1$, then the sequence will likely converge. However, we must ensure that the derivative stays below 1 as the sequence moves. If it grows larger than 1 between $a_0$ and $\alpha$, the sequence may diverge.

  2. If $|f'(a_0)| > 1$, then the sequence will locally diverge (move away from $a_0$). It might jump into a different basin of attraction where the derivative is smaller, or it might diverge to infinity.

  3. If $|f'(a_0)| = 1$, then we may anticipate very slow convergence, although we cannot be sure.

At any rate, one viable approach is to monitor the step size during the iterative process. Assuming that $f$ is a contraction mapping with a Lipschitz constant $0\le k<1$, we have

$$|f(x) - f(y)| \leq k|x - y| \implies |a_{n+1} - a_n| = |f(a_n) - f(a_{n-1})| \leq k|a_n - a_{n-1}|.$$

We want to estimate the distance between the current point $a_n$ and the fixed point $\alpha$. Since the sequence converges to $\alpha$, we can express $\alpha$ as the limit of the sequence $a_{n+m}$ as $m \to \infty$. We can expand the distance $|a_n - \alpha|$ as the sum of all future steps using the triangle inequality:

$$\begin{align*} |a_n - \alpha| &= |a_n - a_{n+1} + a_{n+1} - a_{n+2} + \dots| \\ &\leq |a_n - a_{n+1}| + |a_{n+1} - a_{n+2}| + |a_{n+2} - a_{n+3}| + \dots \\ &\leq \left( k|a_n - a_{n-1}| \right) + \left( k^2|a_n - a_{n-1}| \right) + \left( k^3|a_n - a_{n-1}| \right) + \dots && \text{(iterating the distances)} \\ &= |a_n - a_{n-1}| \left( k + k^2 + k^3 + \dots \right) \\ &= |a_n - a_{n-1}| \cdot \frac{k}{1-k} && \text{$\left(\sum_{i=1}^{\infty} k^i = \frac{k}{1-k}\right)$}. \end{align*}$$

We have a clear stopping criterion, since we can check whether the current step size satisfies this inequality for the desired accuracy $|a_n - \alpha| \le \epsilon$, with $\epsilon>0$. Usually, the criterion should be attained within some predefined maximum number of iterations that could be specified as a parameter. The step sizes should strictly decrease, hence the ratio of subsequent step sizes should be less than one; of course, this happens when the sequence converges.
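A minimal sketch of such a stopping criterion, assuming a known Lipschitz bound $k$ (here $k=0.85$ works for $\cos$ on $[0,1]$, since $\sin(1)<0.85$):

```python
from math import cos

def fixed_point(f, x0, k, eps=1e-12, max_iter=1000):
    # Iterate x = f(x); stop once the contraction-based error bound
    # |a_n - alpha| <= |a_n - a_{n-1}| * k / (1 - k) drops below eps.
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) * k / (1 - k) < eps:
            return x_next
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

dottie = fixed_point(cos, 1.0, k=0.85)  # the Dottie number of Exercise 2.19
assert abs(cos(dottie) - dottie) < 1e-10
```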

Exercise 2.25

The book mentions that restructuring the terms of a recurrence corresponds to factoring the polynomial whose degree equals the order of the recurrence. There are two worked-out examples:

  • For $a_n=2^n-1=2^n-1^n$ the "characteristic polynomial" is $(1-2x)(1-x)$ in factored form.

  • For $a_n=3^n-2^n$ the "characteristic polynomial" is $(1-3x)(1-2x)$ in factored form.

In our case, we need to work with a cubic polynomial

$$(1-4x)(1-3x)(1-2x) = 1-9x+26x^2-24x^3.$$

The term $x^d$ is associated with $a_{n-d}$. Therefore, we can immediately construct our recurrence

$$a_n=9a_{n-1}-26a_{n-2}+24a_{n-3} \quad \text{for $n>2$}.$$

The initial values are

$$\begin{align*} a_0 &=4^0-3^0+2^0=1 \\ a_1 &=4^1-3^1+2^1=3 \\ a_2 &=4^2-3^2+2^2=11. \end{align*}$$
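A one-liner check that the recurrence indeed reproduces $4^n-3^n+2^n$:

```python
a = [4**n - 3**n + 2**n for n in range(20)]
assert a[:3] == [1, 3, 11]
assert all(a[n] == 9*a[n-1] - 26*a[n-2] + 24*a[n-3] for n in range(3, 20))
```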

In practice, it’s more common to work with a characteristic equation whose roots correspond to terms in a solution. This is also employed later in the book.

🌟 Exercise 2.26

The main idea is the same as in Exercise 2.8. The notes for recitation 12 of the course 6.042/18.062J Mathematics for Computer Science (2005) summarize the methodology for solving inhomogeneous linear recurrences and contain additional examples for each recurrence type.

The paper Some Techniques for Solving Recurrences by George S. Lueker is a superb tutorial on solving recurrences, with advanced material on the systematic search for particular solutions of inhomogeneous linear recurrences. The text covers many recurrence types that occur in practical applications.

Exercise 2.27

We know that $a_n=c_03^n+c_12^n$, as stated in the book. Furthermore, we have

$$\begin{align*} a_0&=c_0+c_1 \\ a_1&=3c_0+2c_1. \end{align*}$$

We need to work backwards to find initial conditions resulting in $c_0=0$ and $c_1=1$. We immediately get $a_0=1$ and $a_1=2$. There are no initial conditions for which the solution is $a_n=2^n-1$.

🌟 Exercise 2.28

By tweaking initial conditions, we can fundamentally change the growth rate of the solution to

$$a_n=2a_{n-1}+a_{n-2}-2a_{n-3} \quad \text{for $n>2$}.$$

We use the same approach as in Exercise 2.27. Here are the initial conditions leading to each desired growth rate of $a_n$:

  • Constant is achieved when $a_0=a_1=a_2=c_0$. For example, $a_0=a_1=a_2=2$ makes $a_n=2$.

  • Exponential is achieved when $a_0=c_2$, $a_1=2c_2$ and $a_2=4c_2$. For example, $a_0=1$, $a_1=2$ and $a_2=4$ makes $a_n=2^n$.

  • Fluctuating in sign is achieved when $a_0=c_1$, $a_1=-c_1$ and $a_2=c_1$. For example, $a_0=1$, $a_1=-1$ and $a_2=1$ makes $a_n=(-1)^n$.

Exercise 2.29

The characteristic equation is $z^2-2z-4=0$, whose roots are $z_1=1+\sqrt5$ and $z_2=1-\sqrt5$. Therefore, the general solution is $a_n=c_0(1+\sqrt 5)^n+c_1(1-\sqrt5)^n$ for $n>1$. We also have the following system of linear equations:

$$\begin{align*} a_0&=1=c_0+c_1 \\ a_1&=2=c_0(1+\sqrt 5)+c_1(1-\sqrt 5). \end{align*}$$

The constants are $c_0=\frac{\sqrt5+1}{2\sqrt5}$ and $c_1=\frac{\sqrt5-1}{2\sqrt5}$.
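A floating-point check of the closed form against the recurrence $a_n=2a_{n-1}+4a_{n-2}$:

```python
from math import sqrt

s5 = sqrt(5)
c0 = (s5 + 1) / (2 * s5)
c1 = (s5 - 1) / (2 * s5)

a, b = 1, 2  # a_0, a_1
for n in range(2, 25):
    a, b = b, 2 * b + 4 * a  # advance the recurrence
    closed = c0 * (1 + s5) ** n + c1 * (1 - s5) ** n
    assert abs(b - closed) / b < 1e-9
```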

Exercise 2.30

The characteristic equation is $z^2-2z+1=0$, whose double root is $z_1=1$. Therefore, the general solution is $a_n=c_0+c_1n$ for $n>1$. We also have the following system of linear equations:

$$\begin{align*} a_0&=0=c_0\\ a_1&=1=c_0+c_1. \end{align*}$$

The constants are $c_0=0$ and $c_1=1$.

If we change the initial conditions, then we get another system of linear equations:

$$\begin{align*} a_0&=1=c_0\\ a_1&=1=c_0+c_1. \end{align*}$$

The constants are $c_0=1$ and $c_1=0$.

🌟 Exercise 2.31

We should solve the recurrence $a_n=a_{n-1}-a_{n-2}$ for $n>1$ with $a_0=0$ and $a_1=1$. The characteristic equation is $z^2-z+1=0$, whose complex roots are $z_1=(1+\sqrt3i)/2$ and $z_2=(1-\sqrt3i)/2$. We could continue as usual, but the expression for $a_n$ would be awkward. Let's use Euler's formula to convert the complex numbers into trigonometric form. The magnitudes of the roots are 1, so they can be represented as $e^{\pm i\frac{\pi}{3}}=\cos(\frac{\pi}{3})\pm i\sin(\frac{\pi}{3})$. Thanks to Euler's formula, exponentiation becomes trivial. The general solution is

$$\begin{align*} a_n&=c_0e^{in\frac{\pi}{3}}+c_1e^{-in\frac{\pi}{3}}\\ &=(c_0+c_1)\cos\left(n\frac{\pi}{3}\right)+i(c_0-c_1)\sin\left(n\frac{\pi}{3}\right) \\ &=c'_0\cos\left(n\frac{\pi}{3}\right)+c'_1\sin\left(n\frac{\pi}{3}\right). \end{align*}$$

Since the sequence $a_n$ is real (it starts with real numbers $a_0, a_1$ and has real coefficients), the constants $c_0$ and $c_1$ must be complex conjugates. Consequently, both constants $c'_0$ and $c'_1$ are real. We also have the following system of linear equations:

$$\begin{align*} a_0 &=0= c'_0 \cos(0) + c'_1 \sin(0) \\ a_1&=1=c'_0\cos\left(\frac{\pi}{3}\right) + c'_1 \sin\left(\frac{\pi}{3}\right). \end{align*}$$

The constants are $c'_0=0$ and $c'_1=2/\sqrt3$. Thus,

$$a_n=\frac{2}{\sqrt3}\sin\left(n\frac{\pi}{3}\right) \quad \text{for $n>1$}.$$
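The trigonometric solution matches the (period-6) sequence exactly:

```python
from math import sin, pi, sqrt, isclose

a = [0, 1]
for n in range(2, 25):
    a.append(a[-1] - a[-2])  # a_n = a_{n-1} - a_{n-2}

assert all(isclose(a[n], 2 / sqrt(3) * sin(n * pi / 3), abs_tol=1e-9)
           for n in range(25))
```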

This was an amazing journey from a seemingly straightforward second-order linear recurrence to the trigonometric solution via complex algebra.

🌟 Exercise 2.32

We should solve the recurrence $2a_n=3a_{n-1}-3a_{n-2}+a_{n-3}$ for $n>2$ with $a_0=0$, $a_1=1$ and $a_2=2$. The characteristic equation is $2z^3 - 3z^2 + 3z - 1 = 0$. Let's first search for a rational root; if we find one, then dividing the polynomial by the corresponding factor leaves a quadratic equation, which is directly solvable. The rational root theorem enumerates all rational candidates. It turns out that $1/2$ is a root, thus $(2z-1)$ is a factor.

Using classical polynomial division, we get the characteristic equation in factored form:

$$(z^2 - z + 1)(2z-1)=0.$$

The other two roots are the complex pair $z_{2,3} = \frac{1}{2} \pm i\frac{\sqrt{3}}{2}$. Using the same transformation as in Exercise 2.31, we can write down the general solution as

$$a_n = c_0 \left(\frac{1}{2}\right)^n + c_1 \cos\left(\frac{n\pi}{3}\right) + c_2 \sin\left(\frac{n\pi}{3}\right).$$

We also have the following system of linear equations:

$$\begin{align*} a_0 &= 0=c_0 + c_1 \\ a_1&=1 = \frac{1}{2} c_0 + \frac{1}{2} c_1 + \frac{\sqrt{3}}{2} c_2 \\ a_2&=2 = \frac{1}{4} c_0 - \frac{1}{2} c_1 + \frac{\sqrt{3}}{2} c_2. \end{align*}$$

The constants are $c_0=4/3$, $c_1=-4/3$ and $c_2= 2/\sqrt{3}$.

🌟 Exercise 2.33

We need to find a recurrence describing a sequence for which the order of growth decreases exponentially for odd-numbered terms, but increases exponentially for even-numbered terms. Let’s split the problem into two subproblems based on parity of n>1n>1

an={2an2if n is even,12an2if n is odd.a_n=\begin{cases} 2a_{n-2} &\text{if $n$ is even,} \\ \cfrac{1}{2}a_{n-2} &\text{if $n$ is odd}. \end{cases}

We can combine both cases into a single recurrence an=2(1)nan2a_n=2^{(-1)^n}a_{n-2} for n>1n>1. The initial conditions are a0=a1=1a_0=a_1=1.
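A quick numerical check that the combined recurrence behaves as required:

```python
# a_n = 2^((-1)^n) * a_{n-2} for n > 1 with a_0 = a_1 = 1.
a = [1.0, 1.0]
for n in range(2, 12):
    a.append(2.0 ** ((-1) ** n) * a[n - 2])

assert a[10] == 2**5    # even-indexed terms grow: a_{2m} = 2^m
assert a[9] == 2.0**-4  # odd-indexed terms shrink: a_{2m+1} = 2^(-m)
```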

Exercise 2.34

The sequence is known as Tribonacci numbers. The characteristic equation is z3z2z1=0z^3-z^2-z-1=0, whose single real root is the Tribonacci constant α1.839286755.\alpha \approx 1.839286755. The other two complex roots β\beta and γ\gamma have magnitudes less than one, hence they tend to zero for large NN. Therefore, the approximate general solution is

FN(3)c0αN.\boxed{F_N^{(3)} \approx c_0 \alpha^N}.

Finding the Unknown Constant c0c_0

To find the constant c0c_0 we need to use the exact general solution

FN(3)=c0αN+c1βN+c2γN.F_N^{(3)} = c_0 \alpha^N + c_1 \beta^N + c_2 \gamma^N.

We have the following system of linear equations:

F0=c0+c1+c2F1=c0α+c1β+c2γF2=c0α2+c1β2+c2γ2.\begin{align*} F_0& = c_0 + c_1 + c_2 \\ F_1 &= c_0 \alpha + c_1 \beta + c_2 \gamma \\ F_2 &= c_0 \alpha^2 + c_1 \beta^2 + c_2 \gamma^2. \end{align*}

This system can be written in matrix form:

[111αβγα2β2γ2][c0c1c2]=[F0F1F2]\begin{bmatrix} 1 & 1 & 1 \\ \alpha & \beta & \gamma \\ \alpha^2 & \beta^2 & \gamma^2 \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} F_0 \\ F_1 \\ F_2 \end{bmatrix}

The matrix on the left is a Vandermonde matrix. For any set of initial conditions F0,F1,F2F_0, F_1, F_2, Cramer’s rule gives the "standard formula" for the constant c0c_0 shown below; the final simplification uses our initial conditions F0=F1=0F_0=F_1=0 and F2=1F_2=1:

c0=F2F1(β+γ)+F0(βγ)(αβ)(αγ)=1(αβ)(αγ).c_0 = \frac{F_2 - F_1(\beta + \gamma) + F_0(\beta\gamma)}{(\alpha - \beta)(\alpha - \gamma)}=\frac{1}{(\alpha - \beta)(\alpha - \gamma)}.

The characteristic polynomial is P(z)=z3z2z1=(zα)(zβ)(zγ)P(z)=z^3-z^2-z-1=(z-\alpha)(z-\beta)(z-\gamma). Let’s differentiate both representations of P(z)P(z) and evaluate them at z=αz=\alpha. On the left we have

P(z)=ddz[z3z2z1]=3z22z1P(α)=3α22α1\begin{align*} P'(z) &= \frac{d}{dz} [z^3-z^2-z-1] \\ &= 3z^2-2z-1 \\ &\therefore P'(\alpha)= 3{\alpha}^2 - 2{\alpha} - 1 \end{align*}

On the right we get

P(z)=ddz[(zα)(zβ)(zγ)]=(zβ)(zγ)+(zα)(zγ)+(zα)(zβ)P(α)=(αβ)(αγ).\begin{align*} P'(z) &= \frac{d}{dz} [(z-\alpha)(z-\beta)(z-\gamma)] \\ &= (z-\beta)(z-\gamma) + (z-\alpha)(z-\gamma) + (z-\alpha)(z-\beta) \\ &\therefore P'(\alpha) = (\alpha-\beta)(\alpha-\gamma). \end{align*}

Consequently, (αβ)(αγ)=3α22α1(\alpha-\beta)(\alpha-\gamma)=3{\alpha}^2 - 2{\alpha} - 1, which gives us the value of c0c_0 in terms of α\alpha

c0=13α22α1.c_0 = \frac{1}{3\alpha^2 - 2\alpha - 1}.

Estimating the Accuracy

We can calculate the exact value of F20(3)F_{20}^{(3)} using the following Python script.

a, b, c = 0, 0, 1
n = 20

for _ in range(n - 2):
    a, b, c = b, c, a + b + c
print(c)

It outputs 35,890. The approximate formula gives F20(3)35,889.99962F_{20}^{(3)} \approx 35,889.99962, which is indeed very close to the correct result.
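As a complementary check, we can approximate α\alpha with Newton’s method and evaluate the approximate formula directly; the starting point 2.0 for Newton’s method is an arbitrary choice:

```python
def tribonacci_constant():
    # Newton's method on P(z) = z^3 - z^2 - z - 1, starting near 2.
    z = 2.0
    for _ in range(50):
        z -= (z**3 - z**2 - z - 1) / (3 * z**2 - 2 * z - 1)
    return z

alpha = tribonacci_constant()
c0 = 1 / (3 * alpha**2 - 2 * alpha - 1)
approx = c0 * alpha**20
assert abs(alpha - 1.839286755) < 1e-8
assert round(approx) == 35890  # approx is within 0.01 of the exact value
```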

Exercise 2.35

Let’s define a new sequence bn=n!an    an=bnn!b_n = n! \cdot a_n \implies a_n = \frac{b_n}{n!}. If we substitute this back into the main recurrence, we get

n(n1)bnn!=(n1)bn1(n1)!+bn2(n2)!bn(n2)!=bn1(n2)!+bn2(n2)!bn=bn1+bn2(for n>1).\begin{align*}n(n - 1) \cdot \frac{b_n}{n!} &= (n - 1) \cdot \frac{b_{n - 1}}{(n - 1)!} + \frac{b_{n - 2}}{(n - 2)!} \\ \frac{b_n}{(n-2)!} &= \frac{b_{n-1}}{(n-2)!} + \frac{b_{n - 2}}{(n - 2)!} \\ &\therefore \boxed{b_n = b_{n - 1} + b_{n - 2}} && \text{(for $n>1$)}.\end{align*}

The initial conditions are b1=b0=1b_1=b_0=1, thus bn=Fn+1b_n=F_{n+1}. Therefore,

an=Fn+1n!.a_n=\frac{F_{n+1}}{n!}.
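The closed form can be verified exactly with rational arithmetic; the recurrence below is the one from the derivation above:

```python
from fractions import Fraction
from math import factorial

fib = [0, 1]  # F_0 = 0, F_1 = 1
for _ in range(20):
    fib.append(fib[-1] + fib[-2])

def a(n):
    return Fraction(fib[n + 1], factorial(n))

# Exact check of n(n-1)*a_n = (n-1)*a_{n-1} + a_{n-2} for n > 1.
for n in range(2, 20):
    assert n * (n - 1) * a(n) == (n - 1) * a(n - 1) + a(n - 2)
```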

Exercise 2.36

The following Python script implements the algorithm for checking the validity of a monomial; it covers both types of monomials (pure and mixed). The running time is Θ(plgp+qlgq)\Theta(p\lg p + q\lg q), dominated by sorting the two input lists of lengths pp and qq.

def is_valid_monomial(s, t, n):
    """
    This function checks if a given monomial, represented by two
    lists s and t, appears in the expansion of a[n], for n > 1.
    The function returns True if the monomial is valid, otherwise False.
    """
    assert n > 1

    # Insert sentinel values to avoid index errors.
    s.append(0)
    t.append(0)

    i = len(s) - 1
    j = len(t) - 1

    s.sort()
    t.sort()

    while n > 1:
        # Simple reduction rules:
        # 1. If the largest element is in s,
        #    then the sub-monomial comes from the previous iteration.
        # 2. If the largest element is in t,
        #    then the sub-monomial comes from two iterations back.
        # The premise is that the current monomial is valid
        # iff built from a valid sub-monomial. Easily provable by induction.
        if s[i] > t[j]:
            if s[i] != n - 1:
                return False
            i -= 1
            n -= 1
        else:
            if t[j] != n - 2:
                return False
            j -= 1
            n -= 2

    return n > 0 and i == j == 0

In the expansion of ana_n, for n>1n>1, we have Fn11(1)n2F_n-1-\frac{1-(-1)^n}{2} mixed monomials (consisting of both symbols sis_i and tjt_j). The total number of such monomials, across all iterations, is

k=2n(Fk11(1)k2)=Fn+2F3(n1)n12=Fn+2n1n12.\begin{align*} \sum_{k=2}^n\left(F_k-1-\frac{1-(-1)^k}{2}\right) &= F_{n+2}-F_3-(n-1)-\bigg\lfloor\frac{n-1}{2}\bigg\rfloor \\ &= F_{n+2}-n-1-\bigg\lfloor\frac{n-1}{2}\bigg\rfloor. \end{align*}

In the derivation, we’ve used the identity F1+F2++Fn=Fn+21F_1+F_2+\cdots+F_n=F_{n+2}-1 for the sum of Fibonacci numbers. For example, when n=6n=6 this formula evaluates to 12, which matches the number of mixed monomials listed in the book.
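The summation identity itself (though not the combinatorial claim about monomials) can be checked directly:

```python
fib = [0, 1]
for _ in range(30):
    fib.append(fib[-1] + fib[-2])

for n in range(2, 30):
    lhs = sum(fib[k] - 1 - (1 - (-1)**k) // 2 for k in range(2, n + 1))
    rhs = fib[n + 2] - n - 1 - (n - 1) // 2
    assert lhs == rhs

# n = 6 gives the 12 mixed monomials mentioned above.
assert fib[8] - 6 - 1 - (6 - 1) // 2 == 12
```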

Exercise 2.37

Solving bnb_n proceeds along the same steps as in Exercise 2.29 and won’t be repeated here. The exact solution is

bn=23(1(12)n)     for n>1.b_n=\frac{2}{3}\left(1-\left(-\frac{1}{2}\right)^n\right) \space\space\space\space\text{ for $n>1$}.

Therefore,

an=223(1(12)n)     for n>1.a_n=2^{\frac{2}{3}\left(1-\left(-\frac{1}{2}\right)^n\right)}\space\space\space\space\text{ for $n>1$}.

Exercise 2.38

Let’s square both sides of the initial recurrence to get an2=1+an12a_n^2=1+a_{n-1}^2 for n>0n>0. If we introduce a substitution bn=an2b_n=a_n^2 then the recurrence transforms into bn=1+bn1b_n=1+b_{n-1}, thus bn=nb_n=n. Therefore, an=na_n=\sqrt n for n>0n>0.
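A minimal numerical check, iterating the original (pre-squaring) recurrence from a0=0a_0=0:

```python
from math import sqrt

a = 0.0  # a_0 = 0, consistent with b_0 = 0
for n in range(1, 50):
    a = sqrt(1 + a * a)  # the original recurrence before squaring
    assert abs(a - sqrt(n)) < 1e-9
```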

Exercise 2.39

We need to solve the register allocation recurrence an=an122a_n=a_{n-1}^2-2 for n>0n>0 and various initial values.

Case 1: a0=3a_0=3

The solution is

an=(3+52)2n+(352)2n.a_n = \left(\frac{3 + \sqrt{5}}{2}\right)^{2^n} + \left(\frac{3 - \sqrt{5}}{2}\right)^{2^n}.

As the book notes, only the larger of the two roots predominates in this expression—the one with the plus sign.

Case 2: a0=4a_0=4

The solution is

an=(2+3)2n+(23)2n.a_n = (2 + \sqrt{3})^{2^n} + (2 - \sqrt{3})^{2^n}.

Again, only the larger of the two roots predominates in this expression—the one with the plus sign.
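Both closed forms can be checked against the iteration; the exponents 2n2^n grow so fast that only a few steps are feasible:

```python
from math import sqrt

for a0 in (3, 4):
    disc = sqrt(a0 * a0 - 4)
    bp, bm = (a0 + disc) / 2, (a0 - disc) / 2  # the two roots
    a = a0
    for n in range(1, 5):
        a = a * a - 2
        assert round(bp ** 2**n + bm ** 2**n) == a
```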

Case 3: a0=32a_0=\frac{3}{2}

This case is fundamentally different compared to the previous two. The roots b0b_0 are complex numbers

b0=12(32±7/4)=34±i74.b_0=\frac{1}{2}\left(\frac{3}{2} ± \sqrt{-7/4}\right) = \frac{3}{4} ± i\frac{\sqrt{7}}{4}.

We can use the same approach as in Exercise 2.31 to transform these into trigonometric form. The magnitude of each root is one. Let b0=e±iθb_0=e^{±i\theta}, thus the recurrence becomes

an=(eiθ)2n+(eiθ)2n=ei(2nθ)+ei(2nθ)=2cos(2nθ)(eix+eix=2cos(x) because sin(x)=sin(x)).\begin{align*} a_n &= (e^{i\theta})^{2^n} + (e^{-i\theta})^{2^n} \\ &= e^{i(2^n\theta)} + e^{-i(2^n\theta)} \\ &= 2\cos(2^n\theta) && \text{($e^{ix} + e^{-ix} = 2\cos(x)$ because $\sin(-x)=-\sin(x)$)} . \end{align*}

The sequence ana_n will therefore fluctuate chaotically between -2 and 2, never settling on a single value.
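A short sketch confirming the trigonometric solution for a0=3/2a_0=3/2, with θ=arccos(3/4)\theta=\arccos(3/4):

```python
from math import acos, cos

theta = acos(3 / 4)  # from a_0 = 2*cos(theta) = 3/2
a = 1.5
for n in range(1, 13):
    a = a * a - 2
    assert abs(a - 2 * cos(2**n * theta)) < 1e-6
    assert -2 <= a <= 2
```

Chaotic sensitivity doubles floating-point error at every step, so only the first dozen or so iterations can be compared at this tolerance.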

🌟 Exercise 2.40

This exercise builds upon so-called bifurcation theory: according to Exercise 2.39, the behavior changes fundamentally around the bifurcation point a0=2a_0=2.

For very large ϵ\epsilon and sufficiently large nn, the accurate approximate answer is ana02na_n \approx a_0^{2^n}. Usually, however, in expressions like a0=2+ϵa_0=2+\epsilon, ϵ\epsilon is a small positive constant, so the subsequent analysis assumes this situation. The critical term is

a024=(a02)(a0+2)=ϵ(4+ϵ)2ϵ.\begin{align*} \sqrt{a_0^2 - 4} &= \sqrt{(a_0-2)(a_0+2)} \\ &= \sqrt{\epsilon(4 + \epsilon)} \\ &\approx 2\sqrt{\epsilon}. \end{align*}

For the larger root we have

b0+=12(a0+a024)12((2+ϵ)+2ϵ)=1+ϵ2+ϵ1+ϵ(ϵ1    ϵϵ).\begin{align*} b_0^+ &= \frac{1}{2}(a_0 + \sqrt{a_0^2 - 4}) \\ &\approx \frac{1}{2}( (2 + \epsilon) + 2\sqrt{\epsilon}) \\ &= 1 + \frac{\epsilon}{2} + \sqrt{\epsilon} \\ &\approx 1 + \sqrt{\epsilon} && \text{($\epsilon \ll 1 \implies \sqrt{\epsilon} \gg \epsilon$)}. \end{align*}

For the smaller root we have

b0=12(a0a024)12((2+ϵ)2ϵ)=1+ϵ2ϵ1ϵ.\begin{align*} b_0^- &= \frac{1}{2}(a_0 - \sqrt{a_0^2 - 4}) \\ &\approx \frac{1}{2}( (2 + \epsilon) - 2\sqrt{\epsilon}) \\ &= 1 + \frac{\epsilon}{2} - \sqrt{\epsilon} \\ &\approx 1 - \sqrt{\epsilon}. \end{align*}

As nn increases, (b0)2n{(b_0^-)}^{2^n} will rapidly vanish to 0. Consequently, we get an(1+ϵ)2na_n \approx (1 + \sqrt{\epsilon})^{2^n}.

All in all, this shows that even for an infinitesimally small ϵ\epsilon, the sequence ana_n will diverge to infinity at a double-exponential rate. Recall from the book that it has a nice solution when ϵ=0\epsilon=0.
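Here’s a sketch for one concrete value, ϵ=0.01\epsilon=0.01 (my own choice); it checks the exact two-root solution and the approximation b0+1+ϵb_0^+ \approx 1+\sqrt\epsilon:

```python
from math import sqrt

eps = 0.01  # sample small epsilon
a0 = 2 + eps
disc = sqrt(a0 * a0 - 4)
bp, bm = (a0 + disc) / 2, (a0 - disc) / 2

# The larger root is close to 1 + sqrt(eps), as derived above.
assert abs(bp - (1 + sqrt(eps))) < eps

# The exact solution bp^(2^n) + bm^(2^n) matches the iteration.
a = a0
for n in range(1, 5):
    a = a * a - 2
    assert abs(a - (bp ** 2**n + bm ** 2**n)) < 1e-6
```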

🌟 Exercise 2.41

We need to find all values of the parameters α, β and γ such that an=αan12+βan1+γa_n=\alpha a_{n-1}^2+\beta a_{n-1}+\gamma reduces to bn=bn122b_n=b_{n-1}^2-2 by a linear transformation bn=f(α,β,γ)an+g(α,β,γ)b_n = f(α, β, γ)a_n + g(α, β, γ). In particular, we must also show that an=an12+1a_n=a_{n-1}^2+1 does not reduce to this form.

This exercise lacks an explicit assumption that f0f \neq 0. Otherwise, we can define g=2g=2 and all sequences ana_n would "turn" into bn=bn122b_n=b_{n-1}^2-2 using the transformation bn=0an+gb_n=0\cdot a_n+g.

We expand bnb_n to match parameters

bn=bn122=(fan1+g)22(bn1=fan1+g)=(f2an12+2fgan1+g2)2.\begin{align*} b_n &= b_{n-1}^2 - 2 \\ &= (f a_{n-1} + g)^2 - 2 && \text{($b_{n-1}=f a_{n-1} + g$)} \\ &= (f^2 a_{n-1}^2 + 2fg a_{n-1} + g^2) - 2. \end{align*}

Let’s express ana_n in terms of ff and gg starting with bn=fan+gb_n=f a_n + g

fan+g=(f2an12+2fgan1+g2)2fan=f2an12+2fgan1+(g2g2)an=(f2f)an12+(2fgf)an1+(g2g2f)=fan12+2gan1+g2g2f.\begin{align*} f a_n + g &= (f^2 a_{n-1}^2 + 2fg a_{n-1} + g^2) - 2 \\ f a_n &= f^2 a_{n-1}^2 + 2fg a_{n-1} + (g^2 - g - 2) \\ \bold{a_n} &= \left(\frac{f^2}{f}\right) a_{n-1}^2 + \left(\frac{2fg}{f}\right) a_{n-1} + \left(\frac{g^2 - g - 2}{f}\right) \\ &= \boxed{f a_{n-1}^2 + 2g a_{n-1} + \frac{g^2 - g - 2}{f}}. \end{align*}

We now match the coefficients of this result with the given recurrence an=αan12+βan1+γa_n = \alpha a_{n-1}^2 + \beta a_{n-1} + \gamma and derive the necessary and sufficient condition for transforming ana_n into bn=bn122b_n=b_{n-1}^2-2

  • α=f\alpha = f

  • β=2g    g=β/2\beta = 2g \implies g=\beta/2

  • γ=g2g2f=β2/4β/22α    4αγ=β22β8\gamma = \frac{g^2 - g - 2}{f}=\frac{\beta^2/4 - \beta/2 - 2}{\alpha} \implies \boxed{4\alpha\gamma = \beta^2 - 2\beta - 8}.
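To sanity-check the boxed condition numerically, here is a sketch with sample parameters of my own choice: α=β=2\alpha=\beta=2 (hence f=2f=2, g=1g=1), with γ\gamma forced by the condition:

```python
# Sample parameters: alpha = beta = 2, so f = 2, g = 1, and the condition
# 4*alpha*gamma = beta^2 - 2*beta - 8 forces gamma = -1.
alpha, beta = 2.0, 2.0
gamma = (beta**2 - 2 * beta - 8) / (4 * alpha)
f, g = alpha, beta / 2

a = 0.3        # arbitrary starting point
b = f * a + g  # b_n = f*a_n + g
for _ in range(10):
    a = alpha * a * a + beta * a + gamma
    b = b * b - 2
    assert abs(b - (f * a + g)) < 1e-6
```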

Analysis of an=an12+1a_n = a_{n-1}^2 + 1

The parameters are α=1\alpha=1, β=0\beta=0 and γ=1\gamma=1. Clearly

4αγ=4β22β8=8.4\alpha\gamma =4 \bold{\neq} \beta^2 - 2\beta - 8=-8.

This shows that an=an12+1a_n = a_{n-1}^2 + 1 does not reduce to the target form.

🌟 Exercise 2.42

We need to solve the recurrence an=2an11an12a_n=2a_{n-1} \sqrt{1-a_{n-1}^2} for n>0n>0 and different initial values. Let’s square both sides and use the substitution bn=an2b_n=a_n^2. We get

an2=4an12(1an12)    bn=4bn1(1bn1)=4bn12+4bn1.\begin{align*} a_n^2&=4a_{n-1}^2(1-a_{n-1}^2) \\ &\implies b_n=4b_{n-1}(1-b_{n-1})=\boxed{-4b_{n-1}^2+4b_{n-1}}. \end{align*}

The idea is to leverage the result from Exercise 2.41 and transform bnb_n into cn=cn122c_n=c_{n-1}^2-2. The parameters are α=4=f\alpha=-4=f, β=4=2g\beta=4=2g and γ=0\gamma=0. Since 4(4)0=422484 \cdot (-4) \cdot 0=4^2-2 \cdot 4-8, we can apply the transformation, thus cn=4bn+2c_n=-4b_n+2.

If a0=1/2a_0=1/2, then b0=1/4b_0=1/4, thus c0=1c_0=1. We already know the solution from the book for this special case, namely cn=1c_n=-1. Therefore, bn=3/4b_n=3/4 which implies that an=3/2a_n=\sqrt 3/2 for n>0n>0.

If a0=1/3a_0=1/3, then b0=1/9b_0=1/9, thus c0=14/9c_0=14/9. We hit the third case in Exercise 2.39, so the solution is cn=2cos(2nθ)c_n=2\cos(2^n\theta), where θ=arccos(7/9)\theta=\arccos(7/9). Therefore, bn=(1cos(2nθ))/2b_n=(1-\cos(2^n\theta))/2, which implies that an=(1cos(2nθ))/2a_n=\sqrt {(1-\cos(2^n\theta))/2}.
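The trigonometric solution for a0=1/3a_0=1/3 can be checked against the original recurrence:

```python
from math import acos, cos, sqrt

theta = acos(7 / 9)
a = 1 / 3
for n in range(1, 12):
    a = 2 * a * sqrt(1 - a * a)
    assert abs(a - sqrt((1 - cos(2**n * theta)) / 2)) < 1e-6
```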

Here’s the Python script to plot a6a_6 as a function of a0a_0.

from math import sqrt
import numpy as np
import matplotlib.pyplot as plt

def a(a0):
    x = a0
    for _ in range(6):
        x = 2 * x * sqrt(1 - x * x)
    return x

def plot_a(num_points=1000, save_path="a_plot.png",
           label_fontsize=18, tick_fontsize=14, title_fontsize=20):
    xs = np.linspace(0.0, 1.0, num_points)
    ys = [a(float(x)) for x in xs]

    plt.figure(figsize=(8, 5))
    plt.plot(xs, ys, marker='o', linestyle='-', markersize=4)
    plt.grid(True)
    plt.xlabel(r"$a_0$", fontsize=label_fontsize)
    plt.ylabel(r"$a_6$", fontsize=label_fontsize)
    plt.title(r"Values of $a_6$ for $a_0$ on the interval $[0,1]$",
              fontsize=title_fontsize)
    plt.xticks(fontsize=tick_fontsize)
    plt.yticks(fontsize=tick_fontsize)
    plt.tight_layout()
    plt.savefig(save_path, dpi=150)
    plt.show()

if __name__ == "__main__":
    plot_a()

The plot shows chaotic dynamics: the system’s state after only 6 steps is almost random and highly unpredictable, and the curve looks like a wildly oscillating wave. As a0a_0 approaches 1 the oscillations become extremely rapid.

🌟 Exercise 2.43

Our starting recurrence is a generalized "continued fraction" representation

an=αan1+βγan1+δ     for n>0.a_n=\frac{\alpha a_{n-1}+\beta}{\gamma a_{n-1}+\delta} \space\space\space\space \text{ for $n>0$}.

Let’s represent the recurrence as a function f(x)=(αx+β)/(γx+δ)f(x) = (\alpha x+\beta)/(\gamma x+\delta), where γ0\gamma \neq 0 (otherwise, we won’t have a continued fraction). This function defines a Möbius transformation. We first need to determine its fixed points, denoted by rr and ss; they are the solutions of f(x)=xf(x)=x, that is, of the quadratic γx2+(δα)xβ=0\gamma x^2+(\delta-\alpha)x-\beta=0. Afterward we can leverage a change of variable to linearize the original recurrence

bn=anrans=f(an1)rf(an1)s=αan1+βγan1+δrαan1+βγan1+δs=αan1+βrγan1rδγan1+δαan1+βsγan1sδγan1+δ=αan1+βrγan1rδαan1+βsγan1sδ(assuming γan1+δ0)=αrsbn11bn1+βrγrsbn11bn1rδαrsbn11bn1+βsγrsbn11bn1sδ(an1=rsbn11bn1)=αrαsbn1+ββbn1r2γ+rγsbn1rδ+rδbn11bn1αrαsbn1+ββbn1srγ+s2γbn1sδ+sδbn11bn1=αrαsbn1+ββbn1r2γ+rγsbn1rδ+rδbn1αrαsbn1+ββbn1srγ+s2γbn1sδ+sδbn1(assuming bn11)=(sαβ+rsγ+rδ)bn1(r2γ+rδrαβ)rα+βsrγsδ+(s2γ+sδsαβ)bn1=sαβ+rsγ+rδrα+βsrγsδbn1(r and s are roots of the quadratics)=γs+δγr+δbn1(sαβ=s2γsδ and rα+β=r2γ+rδ)=(γs+δγr+δ)nb0(iterating).\begin{align*} b_n &= \frac{a_n - r}{a_n - s} \\[0.3cm] &= \frac{f(a_{n-1}) - r}{f(a_{n-1}) - s} \\[0.3cm] &= \frac{\cfrac{\alpha a_{n-1}+\beta}{\gamma a_{n-1}+\delta}-r}{\cfrac{\alpha a_{n-1}+\beta}{\gamma a_{n-1}+\delta}-s} \\[0.9cm] &= \frac{\cfrac{\alpha a_{n-1}+\beta-r\gamma a_{n-1}-r\delta}{\gamma a_{n-1}+\delta}}{\cfrac{\alpha a_{n-1}+\beta-s\gamma a_{n-1}-s\delta}{\gamma a_{n-1}+\delta}} \\[0.9cm] &= \frac{\alpha a_{n-1}+\beta-r\gamma a_{n-1}-r\delta}{\alpha a_{n-1}+\beta-s\gamma a_{n-1}-s\delta} && \text{(assuming $\gamma a_{n-1}+\delta \neq 0$)}\\[0.3cm] &= \frac{\alpha \cfrac{r-sb_{n-1}}{1-b_{n-1}}+\beta-r\gamma \cfrac{r-sb_{n-1}}{1-b_{n-1}}-r\delta}{\alpha \cfrac{r-sb_{n-1}}{1-b_{n-1}}+\beta-s\gamma \cfrac{r-sb_{n-1}}{1-b_{n-1}}-s\delta} && \text{$\left(a_{n-1}=\cfrac{r-sb_{n-1}}{1-b_{n-1}}\right)$}\\[0.9cm] &= \frac{\cfrac{\alpha r - \alpha sb_{n-1} +\beta - \beta b_{n-1} -r^2\gamma + r\gamma sb_{n-1} -r\delta +r\delta b_{n-1}}{1-b_{n-1}}} {\cfrac{\alpha r - \alpha sb_{n-1} +\beta - \beta b_{n-1} -sr\gamma + s^2\gamma b_{n-1} -s\delta +s\delta b_{n-1}}{1-b_{n-1}}} \\[0.9cm] &= \frac{\alpha r - \alpha sb_{n-1} +\beta - \beta b_{n-1} -r^2\gamma + r\gamma sb_{n-1} -r\delta +r\delta b_{n-1}} {\alpha r - \alpha sb_{n-1} +\beta - \beta b_{n-1} -sr\gamma + s^2\gamma b_{n-1} -s\delta + s\delta b_{n-1}} && \text{(assuming $b_{n-1} \neq 1$)} \\[0.3cm] &= \frac{(-s\alpha - \beta + rs\gamma +r\delta) b_{n-1} -(r^2\gamma + r\delta -r\alpha-\beta)} 
{r\alpha +\beta -sr\gamma -s\delta + (s^2\gamma + s\delta -s\alpha - \beta)b_{n-1}} \\[0.3cm] &= \frac{-s\alpha - \beta + rs\gamma +r\delta} {r\alpha +\beta -sr\gamma -s\delta} \cdot b_{n-1} && \text{($r$ and $s$ are roots of the quadratics)} \\[0.3cm] &= \frac{\gamma s + \delta}{\gamma r + \delta} \cdot b_{n-1} && \text{($-s\alpha - \beta=-s^2\gamma - s\delta$ and $r\alpha +\beta=r^2\gamma + r\delta$)}\\[0.3cm] &= \left(\frac{\gamma s + \delta}{\gamma r + \delta}\right)^n b_0 && \text{(iterating)}. \end{align*}

The initial value is b0=(a0r)/(a0s)b_0=(a_0-r)/(a_0-s), which for a0=1a_0=1 equals (1r)/(1s)(1-r)/(1-s). If we substitute b0b_0 into the expression for ana_n we get

an=r(1s)s(1r)(γs+δγr+δ)n(1s)(1r)(γs+δγr+δ)n     for n>0.\boxed{a_n=\frac{r(1-s) - s(1-r)\left(\cfrac{\gamma s+\delta}{\gamma r+\delta}\right)^n}{(1-s) - (1-r)\left(\cfrac{\gamma s+\delta}{\gamma r+\delta}\right)^n}} \space\space\space\space \text{ for $n>0$}.
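As a concrete instance (my own choice, not from the exercise), take α=0\alpha=0 and β=γ=δ=1\beta=\gamma=\delta=1, i.e., an=1/(an1+1)a_n = 1/(a_{n-1}+1) with a0=1a_0=1; its fixed points are the roots of x2+x1=0x^2+x-1=0:

```python
from math import sqrt

# Sample instance: alpha = 0, beta = gamma = delta = 1, i.e.
# a_n = 1/(a_{n-1} + 1) with a_0 = 1.
gamma, delta = 1.0, 1.0
r = (-1 + sqrt(5)) / 2  # fixed points: roots of x^2 + x - 1 = 0
s = (-1 - sqrt(5)) / 2

def closed(n):
    k = ((gamma * s + delta) / (gamma * r + delta)) ** n
    return (r * (1 - s) - s * (1 - r) * k) / ((1 - s) - (1 - r) * k)

a = 1.0
for n in range(1, 20):
    a = 1 / (a + 1)
    assert abs(a - closed(n)) < 1e-9
```

Since the multiplier (γs+δ)/(γr+δ)(\gamma s+\delta)/(\gamma r+\delta) has magnitude below one here, the iterates converge to the fixed point rr.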

🌟 Exercise 2.44

We need to express the recurrence

an=1sn+tnan1     for n>0,a_n = \frac{1}{s_n + t_n a_{n-1}} \space \space \space \space \text{ for $n>0$},

where sns_n and tnt_n are arbitrary sequences, as the ratio of two successive terms in a sequence defined by a linear recurrence.

Let an=bn/bn+1a_n=b_n/b_{n+1} for n0n\ge0. We need to specify a linear recurrence relation on bnb_n such that ana_n also satisfies the original non-linear recurrence. We have

\begin{align*} a_n &= \frac{1}{s_n + t_n a_{n-1}} && \text{(for $n>0$)} \\ \cfrac{b_n}{b_{n+1}} &= \frac{1}{s_n + t_n \cfrac{b_{n-1}}{b_n}} \\ \cfrac{b_n}{b_{n+1}} &= \cfrac{1}{\cfrac{s_n b_n + t_n b_{n-1}}{b_n}} \\ \cfrac{b_n}{b_{n+1}} &= \cfrac{b_n}{s_n b_n + t_n b_{n-1}} \\ &\therefore \boxed{b_{n+1} = s_n b_n + t_n b_{n-1} \iff b_n = s_{n-1} b_{n-1} + t_{n-1} b_{n-2}}. \end{align*}

The initial conditions must satisfy

\begin{align*} a_0 &= \frac{b_0}{b_1} \\ 1 &= \frac{b_0}{b_1} && \text{(taking $a_0=1$)} \\ &\implies b_1=b_0. \end{align*}

We can choose any nonzero starting value for b0b_0.
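A minimal check with the sample choice sn=ns_n=n and tn=1t_n=1 (so an=1/(n+an1)a_n=1/(n+a_{n-1}) and a0=1a_0=1):

```python
# Sample sequences: s_n = n, t_n = 1, so a_n = 1/(n + a_{n-1}) with a_0 = 1.
b = [1.0, 1.0]  # b_1 = b_0 from the initial condition a_0 = 1
a = 1.0
for n in range(1, 15):
    b.append(n * b[n] + 1 * b[n - 1])  # b_{n+1} = s_n*b_n + t_n*b_{n-1}
    a = 1 / (n + a)
    assert abs(a - b[n] / b[n + 1]) < 1e-12
```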

🌟 Exercise 2.45

We are given a relaxed quicksort recurrence

an=f(n)+2n1jnaj1     for n>0 with a0=0.a_n=f(n)+\frac{2}{n} \sum_{1 \le j \le n} a_{j-1} \space\space\space\space \text{ for $n>0$ with $a_0=0$}.

We need to solve it when f(n)=n3f(n)=n^3. The repertoire table from the book has many reusable entries except for the n3n^3 term. If we set an=n3a_n=n^3 then an(2/n)j=1naj1=n3/2+n2n/2a_n-(2/n)\sum_{j=1}^n a_{j-1}=n^3/2+n^2-n/2. We need to combine entries from the expanded table to get f(n)=n3f(n)=n^3, thus we have

n3=2(n32+n2n2)6(n213+n1)+14(n12+Hn)+14(Hn+2)291.n^3=2 \left(\frac{n^3}{2} +n^2 - \frac{n}{2}\right) - 6 \left(\frac{n^2 - 1}{3}+ n - 1\right) + 14\left(\frac{n - 1}{2}+H_n\right) + 14(-H_n+2) - 29 \cdot 1.

Therefore, the solution is

an=2n36n(n1)+14(n+1)Hn29n     for n0.a_n=2n^3-6n(n-1)+14(n+1)H_n-29n \space\space\space\space\text{ for $n\ge0$}.
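The solution can be verified exactly with rational arithmetic:

```python
from fractions import Fraction

def closed(n, H):
    # H is the exact harmonic number H_n.
    return 2 * Fraction(n)**3 - 6 * n * (n - 1) + 14 * (n + 1) * H - 29 * n

a = [Fraction(0)]
H = Fraction(0)
for n in range(1, 30):
    a.append(n**3 + Fraction(2, n) * sum(a))  # a_n = n^3 + (2/n) * sum a_{j-1}
    H += Fraction(1, n)
    assert a[n] == closed(n, H)
```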

Exercise 2.46

A post on Mathematics Stack Exchange explains the basics of the repertoire method and illustrates its application by solving this exercise.

🌟 Exercise 2.47

The recurrence here is an=2/(n+an1)a_n=2/(n+a_{n-1}) for n>0n>0 with a0=1a_0=1. Let’s use the following Python script to calculate its numerical values alongside our guess that an2/na_n\sim2/n. This hypothesis is based on the observation that an10a_{n-1} \to 0 as nn \to \infty.

a = 1
for n in range(1, 11):
    a = 2/(n+a)
    print(f"a({n})={a}, â({n})={2/n}")

The output shows that we’re on the right track.

a(1)=1.0, â(1)=2.0
a(2)=0.6666666666666666, â(2)=1.0
a(3)=0.5454545454545455, â(3)=0.6666666666666666
a(4)=0.43999999999999995, â(4)=0.5
a(5)=0.36764705882352944, â(5)=0.4
a(6)=0.3140877598152425, â(6)=0.3333333333333333
a(7)=0.2734449005367856, â(7)=0.2857142857142857
a(8)=0.2417372719639722, â(8)=0.25
a(9)=0.2164095278998315, â(9)=0.2222222222222222
a(10)=0.19576349152197078, â(10)=0.2

If we employ bootstrapping and substitute our guess back into the recurrence, we get

an=2(n1)n2n+2.a_n=\frac{2(n-1)}{n^2-n+2}.

The new version of the script reflects this change.

a = 1
for n in range(1, 11):
    a = 2/(n+a)
    print(f"a({n})={a}, â({n})={2*(n-1)/(n*n-n+2)}")

The results are better aligned as nn increases.

a(1)=1.0, â(1)=0.0
a(2)=0.6666666666666666, â(2)=0.5
a(3)=0.5454545454545455, â(3)=0.5
a(4)=0.43999999999999995, â(4)=0.42857142857142855
a(5)=0.36764705882352944, â(5)=0.36363636363636365
a(6)=0.3140877598152425, â(6)=0.3125
a(7)=0.2734449005367856, â(7)=0.2727272727272727
a(8)=0.2417372719639722, â(8)=0.2413793103448276
a(9)=0.2164095278998315, â(9)=0.21621621621621623
a(10)=0.19576349152197078, â(10)=0.1956521739130435

After an additional iteration we have

an=2(n23n+4)(n1)(n22n+4)     for n>1.a_n=\frac{2 (n^2 - 3 n + 4)}{(n-1)(n^2 - 2 n + 4)} \space\space\space\space \text{ for $n>1$}.

Showing only the output, we see that the approximation of ana_n is improved again.

a(2)=0.6666666666666666, â(2)=1.0
a(3)=0.5454545454545455, â(3)=0.5714285714285714
a(4)=0.43999999999999995, â(4)=0.4444444444444444
a(5)=0.36764705882352944, â(5)=0.3684210526315789
a(6)=0.3140877598152425, â(6)=0.3142857142857143
a(7)=0.2734449005367856, â(7)=0.27350427350427353
a(8)=0.2417372719639722, â(8)=0.24175824175824176
a(9)=0.2164095278998315, â(9)=0.21641791044776118
a(10)=0.19576349152197078, â(10)=0.19576719576719576

This process can be repeated until a desired accuracy level is reached.

There’s no simple closed-form solution to ana_n. The standard method to "solve" it is based on the technique from Exercise 2.44. In this case, the substitution should be an=2bn1/bna_n=2b_{n-1}/b_n for n1n\ge1, which yields bn=snbn1+2tnbn2b_n=s_nb_{n-1}+2t_nb_{n-2}, where sn=ns_n=n and tn=1t_n=1. The initial values are b0=1b_0=1 and b1=2b_1=2.
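A quick check that the substitution indeed reproduces ana_n:

```python
# a_n = 2/(n + a_{n-1}) with a_0 = 1, versus 2*b_{n-1}/b_n with
# b_n = n*b_{n-1} + 2*b_{n-2}, b_0 = 1, b_1 = 2.
b = [1.0, 2.0]
a = 1.0
for n in range(1, 15):
    a = 2 / (n + a)
    if n >= 2:
        b.append(n * b[n - 1] + 2 * b[n - 2])
    assert abs(a - 2 * b[n - 1] / b[n]) < 1e-12
```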

🌟 Exercise 2.48

The book assumes (without stating it explicitly) that we need to find the approximation of the average number of compares used by median-of-three quicksort. For the sake of completeness, the recurrence is repeated here

CN=N+1+2(N3)k=1N(k1)(Nk)Ck1.C_N = N+1 + \frac{2}{\binom{N}{3}} \sum_{k=1}^{N} (k-1)(N-k) C_{k-1}.

The key is to first convert this discrete recurrence into its continuous (integral) form, which is what we need for bootstrapping when working with recurrences containing summation over a full history. The process can be broken down into three major stages.

Stage 1: Converting the Discrete Recurrence

Let’s simplify the constant multiplier of the sum for large enough NN

2(N3)=2N(N1)(N2)62N3/6=12N3.\begin{align*} \frac{2}{\binom{N}{3}} &= \frac{2}{\cfrac{N(N-1)(N-2)}{6}} \\ &\approx \frac{2}{N^3/6} \\ &= \frac{12}{N^3}. \end{align*}

Let’s set the pivot rank kk relative to the size NN; for this we use the continuous variable x=k/N    k=xN    dk=Ndxx = k/N \implies k = xN \implies dk = N dx. Finally, we treat k1kk-1 \approx k and N+1NN+1 \approx N. If we substitute these changes into our original expression and replace the summation with an integral, we have

CNN+1201(xx2)CxNdx.C_N \approx N+ 12 \int_{0}^{1} (x-x^2) C_{xN} \,dx.

Stage 2: Bootstrapping the Solution

Our initial hypothesis is CNαNlnNC_N \sim \alpha N \ln N. After all, we expect the familiar asymptotic form Θ(NlnN)\Theta(N \ln N) seen in other quicksort variants. Let’s substitute this hypothesis into our recurrence relation

CNN+1201C(xN)(xx2)dxN+1201[α(xN)ln(xN)](xx2)dx=N+12αN01(x(lnN+lnx))(xx2)dx(ln(xN)=lnN+lnx)=N+12αN01(x2lnNx3lnN+x2lnxx3lnx)dx.\begin{align*} C_N &\approx N + 12 \int_{0}^{1} C(xN) (x-x^2) \,dx \\ &\approx N + 12 \int_{0}^{1} [\alpha (xN) \ln(xN)] (x-x^2) \,dx \\ &= N + 12 \alpha N \int_{0}^{1} (x (\ln N + \ln x)) (x-x^2) \,dx && \text{($\ln(xN) = \ln N + \ln x$)} \\ &= N + 12 \alpha N \int_{0}^{1} (x^2 \ln N - x^3 \ln N + x^2 \ln x - x^3 \ln x) \,dx. \end{align*}

Let’s split this integral into the lnN\ln N part and the lnx\ln x part.

12αNlnN01(x2x3)dx=12αNlnN[x33x44]01=12αNlnN(1314)=12αNlnN(112)=αNlnN.\begin{align*} 12 \alpha N \ln N \int_{0}^{1} (x^2 - x^3) \,dx &= 12 \alpha N \ln N \left[ \frac{x^3}{3} - \frac{x^4}{4} \right]_{0}^{1} \\ &= 12 \alpha N \ln N \left( \frac{1}{3} - \frac{1}{4} \right) \\ &= 12 \alpha N \ln N \left( \frac{1}{12} \right) \\ &= \mathbf{\alpha N \ln N}. \end{align*}
N+12αN01(x2lnxx3lnx)dx=N+12αN([01x2lnxdx][01x3lnxdx])=N+12αN([1(2+1)2][1(3+1)2])=N7α12N=(17α12)N.\begin{align*} N+12 \alpha N \int_{0}^{1} (x^2 \ln x - x^3 \ln x) \,dx &= N+12 \alpha N \left( \left[ \int_{0}^{1} x^2 \ln x \,dx \right]-\left[ \int_{0}^{1} x^3 \ln x \,dx \right] \right) \\ &= N+12 \alpha N \left( \left[ -\frac{1}{(2+1)^2} \right] - \left[ -\frac{1}{(3+1)^2} \right] \right) \\ &= N- \frac{7\alpha}{12} N \\ &= \mathbf{\left(1 - \frac{7\alpha}{12}\right) N}. \end{align*}

In the second line of the above derivation, we used the identity 01xklnxdx=1(k+1)2\int_0^1 x^k \ln x \,dx = -\frac{1}{(k+1)^2}.

Stage 3: Determining the Value of α\alpha

Bootstrapping confirms the statement from the book that

CN(αNlnN)Initial+(17α12)NExpanded=αNlnN+O(N).\begin{align*} C_N &\approx \underbrace{(\alpha N \ln N)}_{\text{Initial}} + \underbrace{\left(1 - \frac{7\alpha}{12}\right) N}_{\text{Expanded}} \\ &= \alpha N \ln N + O(N). \end{align*}

If we want to retain only the leading factor, then we should choose α=12/7\alpha=12/7 to "eradicate" the O(N)O(N) summand.
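As a numeric sanity check of α=12/7\alpha=12/7, the following sketch iterates the exact discrete recurrence, kept O(N)O(N) with running sums. The base cases C0=C1=0C_0=C_1=0 and C2=1C_2=1 are an assumption of this sketch, not taken from the book; they only affect lower-order terms:

```python
from math import log

# Assumed base cases (my choice): C_0 = C_1 = 0, C_2 = 1.
C = [0.0, 0.0, 1.0]
A = B = 0.0  # A = sum (k-1)*C_{k-1}, B = sum k*(k-1)*C_{k-1} over k = 1..N
for k in (1, 2):
    A += (k - 1) * C[k - 1]
    B += k * (k - 1) * C[k - 1]

for N in range(3, 3001):
    A += (N - 1) * C[N - 1]  # extend both sums to k = N
    B += N * (N - 1) * C[N - 1]
    binom = N * (N - 1) * (N - 2) / 6
    # sum_{k=1}^{N} (k-1)(N-k)*C_{k-1} = N*A - B
    C.append(N + 1 + (2 / binom) * (N * A - B))

def ratio(N):
    return C[N] / (N * log(N))

# The ratio slowly approaches 12/7, about 1.714.
assert abs(ratio(3000) - 12 / 7) < abs(ratio(500) - 12 / 7)
```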

🌟 Exercise 2.49

With bootstrapping we start with a rough bound and then improve it step by step. The recurrence is

an=1n0k<naknk=1n2+1n1k<naknk.a_n = \frac{1}{n} \sum_{0 \le k < n} \frac{a_k}{n-k} = \frac{1}{n^2} + \frac{1}{n} \sum_{1 \le k < n} \frac{a_k}{n-k}.

We separated the k=0k=0 term, which is a0n(n0)=1n2\frac{a_0}{n(n-0)} = \frac{1}{n^2}.

Iteration 1: From O(1/n)O(1/n) to O(lnnn2)O(\frac{\ln n}{n^2})

Let’s start with the rough guess that an=O(1/n)a_n = O(1/n) for n>0n>0. This means there exists some constant c>0c>0 such that anc/na_n \le c/n. Note that an>0a_n>0 for all n0n\ge0. If we substitute this into the recurrence, we have

an1n2+1n1k<nck(nk)=1n2+cn21k<n(1k+1nk)(1k(nk)=1n(1k+1nk))=1n2+cn2(2Hn1)1n2+2clnnn2an=O(lnnn2)=O(1/n).\begin{align*} a_n &\le \frac{1}{n^2} + \frac{1}{n} \sum_{1 \le k < n} \frac{c}{k(n-k)} \\ &= \frac{1}{n^2} + \frac{c}{n^2} \sum_{1 \le k < n} \left( \frac{1}{k} + \frac{1}{n-k} \right) && \text{$\left(\cfrac{1}{k(n-k)} = \cfrac{1}{n} \left( \cfrac{1}{k} + \cfrac{1}{n-k} \right)\right)$} \\ &= \frac{1}{n^2} + \frac{c}{n^2} (2 H_{n-1}) \\[0.4cm] &\approx \frac{1}{n^2} + \frac{2c \ln n}{n^2} \\ &\therefore \boxed{a_n=O\left(\frac{\ln n}{n^2}\right)}=O(1/n). \end{align*}

Iteration 2: From O(lnnn2)O(\frac{\ln n}{n^2}) to O(1/n2)O(1/n^2)

We assume that anclnnn2a_n \le c \frac{\ln n}{n^2} for some constant c>0c>0 and n>0n>0. If we substitute this into the recurrence, we have

an1n2+cn1k<nlnkk2(nk).a_n \le \frac{1}{n^2} + \frac{c}{n} \sum_{1 \le k < n} \frac{\ln k}{k^2 (n-k)}.

To bound the sum S=1k<nlnkk2(nk)S = \sum_{1 \le k < n} \frac{\ln k}{k^2 (n-k)}, we split it into two parts: the first half of the range and the second half. For convenience assume that nn is even.

When 1kn/21 \le k \le n/2 then nkn/2    1/(nk)2/nn-k \ge n/2 \implies 1/(n-k) \le 2/n. We have

S12n1kn/2lnkk2<2nk=1lnkk2=O(1/n).S_1 \le \frac{2}{n} \sum_{1 \le k \le n/2} \frac{\ln k}{k^2} < \frac{2}{n} \sum_{k=1}^\infty \frac{\ln k}{k^2}=O(1/n).

The infinite series k=1lnkk2\sum_{k=1}^{\infty} \frac{\ln k}{k^2} converges to a constant.

When n/2<k<nn/2 < k < n then lnkk2\frac{\ln k}{k^2} is strictly decreasing. The maximum value is at the start of the range, bounded by ln(n/2)(n/2)2=O(lnnn2)\approx \frac{\ln(n/2)}{(n/2)^2} = O(\frac{\ln n}{n^2}). Therefore,

S2O(lnnn2)n/2<k<n1nk=O(lnnn2)O(lnn)=O((lnn)2n2).S_2\le O\left(\frac{\ln n}{n^2}\right) \sum_{n/2 < k < n} \frac{1}{n-k}= O\left(\frac{\ln n}{n^2}\right) \cdot O(\ln n) = O\left( \frac{(\ln n)^2}{n^2} \right).

Thus,

an1n2+cn(O(1/n)+O((lnn)2n2))=O(1/n2)=O(lnnn2).a_n \le \frac{1}{n^2} + \frac{c}{n} \left(O(1/n)+O\left( \frac{(\ln n)^2}{n^2} \right)\right)=\boxed{O(1/n^2)}=O\left(\frac{\ln n}{n^2}\right).

Iteration 3: From O(1/n2)O(1/n^2) to O(1/n2)O(1/n^2)

To be rigorous, we should check that this form is "stable"; if we assume O(1/n2)O(1/n^2), we get O(1/n2)O(1/n^2) back. So, we take that anc/n2a_n \le c/n^2 for some constant c>0c>0 and n>0n>0. If we substitute this into the recurrence, we have

an1n2+cn1k<n1k2(nk).a_n \le \frac{1}{n^2} + \frac{c}{n} \sum_{1 \le k < n} \frac{1}{k^2(n-k)}.

We split the summation again into two parts. When 1kn/21 \le k \le n/2 then 1nk2n\frac{1}{n-k} \le \frac{2}{n}, hence

S12nk=11k2=O(1/n).S_1\le \frac{2}{n} \sum_{k=1}^\infty \frac{1}{k^2} = O(1/n).

The sum k=11k2=ζ(2)\sum_{k=1}^\infty \frac{1}{k^2}=\zeta(2), where ζ\zeta is the Riemann zeta function.

When n/2<k<nn/2 < k < n then 1k24n2\frac{1}{k^2} \le \frac{4}{n^2}. Thus, S24n21nk=O(lnnn2)S_2\le\frac{4}{n^2} \sum \frac{1}{n-k} = O(\frac{\ln n}{n^2}). Therefore,

an1n2+cn(O(1/n)+O(lnnn2))=O(1/n2).a_n \le \frac{1}{n^2} + \frac{c}{n} \cdot \left(O(1/n) +O\left(\frac{\ln n}{n^2}\right)\right)= O(1/n^2).

Final Result

From an=O(1/n2)a_n=O(1/n^2) it follows that n2an=O(1)n^2a_n=O(1).
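A numerical check of the O(1)O(1) claim, with a0=1a_0=1 as implied by the separated k=0k=0 term:

```python
# a_0 = 1 (so that the k = 0 term is 1/n^2); a_n = (1/n) * sum_{0<=k<n} a_k/(n-k).
N = 1000
a = [1.0]
for n in range(1, N + 1):
    a.append(sum(a[k] / (n - k) for k in range(n)) / n)

# n^2 * a_n should stay bounded, in line with a_n = O(1/n^2).
scaled = [n * n * a[n] for n in (100, 500, 1000)]
assert all(0 < v < 100 for v in scaled)
assert a[1000] < a[100]
```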

🌟 Exercise 2.50

The recurrence is

an+1=(1+1n)an+(11n)an1     for n>0.a_{n+1} = \left(1 + \frac{1}{n}\right)a_n + \left(1 - \frac{1}{n}\right)a_{n-1} \space\space\space\space\text{ for $n>0$}.

The book states that it applies for n>1n>1, but in that case a2a_2 would be missing, since initial values are a0=0a_0=0 and a1=1a_1=1. We can view the recurrence as

an+1anan1Standard Fibonacci=1n(anan1)Perturbation Term.\underbrace{a_{n+1} - a_n - a_{n-1}}_{\text{Standard Fibonacci}} = \underbrace{\frac{1}{n}(a_n - a_{n-1})}_{\text{Perturbation Term}}.

The LHS essentially produces values of the standard Fibonacci sequence, since an+1an+an1a_{n+1} \approx a_n + a_{n-1} if we ignore perturbations. Thus, the solution to a simpler recurrence bn+1=bn+bn1b_{n+1}=b_n+b_{n-1} for n>0n>0 with b0=0b_0=0 and b1=1b_1=1 is bn=Fn=Θ(ϕn)b_n=F_n =\Theta( \phi^n), where ϕ\phi is the golden ratio.

Following the steps from the book, we need to compare the two recurrences by forming the ratio

ρn=anbn=anΘ(ϕn).\rho_n=\frac{a_n}{b_n}=\frac{a_n}{\Theta(\phi^n)}.

If we plug Θ(ϕn)ρn\Theta(\phi^n)\rho_n into the original recurrence, we won’t be able to bound ρn\rho_n. The perturbation term is approximately

1n(ϕnϕn1)1nϕn(ϕ1),\frac{1}{n}(\phi^n - \phi^{n-1}) \approx \frac{1}{n} \phi^n (\phi-1),

and the Θ(ϕn)\Theta(\phi^n) growth rate itself is not stable against the 1/n1/n perturbations in the original recurrence. In our case, this instability means the solution must grow faster. Therefore, we should add an extra nαn^\alpha multiplier and find the value of α\alpha to achieve ρn=O(1)\rho_n=O(1). Let’s use the "error" ratio

ρnannαϕn     for n>1,\rho_n\sim\frac{a_n}{n^\alpha\phi^n} \space\space\space\space \text{ for $n>1$},

with ρ1=1/ϕ\rho_1=1/\phi. We have

(n+1)αϕn+1ρn+1=(1+1n)nαϕnρn+(11n)(n1)αϕn1ρn1ϕ2(n+1)αnαρn+1=ϕ(1+1n)ρn+(11n)(n1)αnαρn1ϕ2(1+1n)αρn+1=ϕ(1+1n)ρn+(11n)(11n)αρn1ϕ2(1+1n)αρn+1[ϕ(1+1n)+(11n)(11n)α]ρn(assume ρn is non-decreasing)ϕ2(1+αn)ρn+1[ϕ(1+1n)+(11n)(1αn)]ρn(Taylor expansion (1+x)α1+αx)(ϕ2+αϕ2n)ρn+1[ϕ+ϕn+(1αn1n+O(1n2))]ρn(ϕ2+αϕ2n)ρn+1[(ϕ+1)+ϕ1αn]ρn(ignore O(1/n2)(ϕ2+αϕ2n)ρn+1[ϕ2+ϕ1αn]ρn.\begin{align*} (n+1)^\alpha \phi^{n+1} \rho_{n+1} &= \left(1 + \frac{1}{n}\right) n^\alpha \phi^n \rho_n + \left(1 - \frac{1}{n}\right) (n-1)^\alpha \phi^{n-1} \rho_{n-1} \\[0.4cm] \phi^2 \frac{(n+1)^\alpha}{n^\alpha} \rho_{n+1} &= \phi \left(1 + \frac{1}{n}\right) \rho_n + \left(1 - \frac{1}{n}\right) \frac{(n-1)^\alpha}{n^\alpha} \rho_{n-1} \\[0.4cm] \phi^2 \left(1 + \frac{1}{n}\right)^\alpha \rho_{n+1} &= \phi \left(1 + \frac{1}{n}\right) \rho_n + \left(1 - \frac{1}{n}\right) \left(1 - \frac{1}{n}\right)^\alpha \rho_{n-1} \\[0.4cm] \phi^2 \left(1 + \frac{1}{n}\right)^\alpha \rho_{n+1} &\le \left[\phi \left(1 + \frac{1}{n}\right) + \left(1 - \frac{1}{n}\right) \left(1 - \frac{1}{n}\right)^\alpha \right] \rho_n && \text{(assume $\rho_n$ is non-decreasing)} \\[0.4cm] \phi^2 \left(1 + \frac{\alpha}{n}\right) \rho_{n+1} &\le \left[\phi \left(1 + \frac{1}{n}\right) + \left(1 - \frac{1}{n}\right) \left(1 - \frac{\alpha}{n}\right)\right] \rho_n && \text{(Taylor expansion $(1 + x)^\alpha \approx 1 + \alpha x$)} \\[0.4cm] \left(\phi^2 + \frac{\alpha \phi^2}{n}\right) \rho_{n+1} &\le \left[\phi + \frac{\phi}{n} + \left(1 - \frac{\alpha}{n} - \frac{1}{n} + O\left(\frac{1}{n^2}\right)\right)\right] \rho_n \\[0.4cm] \left(\phi^2 + \frac{\alpha \phi^2}{n}\right) \rho_{n+1} &\le \left[(\phi + 1) + \frac{\phi - 1 - \alpha}{n}\right] \rho_n && \text{(ignore $O(1/n^2$)}\\[0.4cm] \left(\phi^2 + \frac{\alpha \phi^2}{n}\right) \rho_{n+1} &\le \left[\phi^2 + \frac{\phi - 1 - \alpha}{n}\right] \rho_n. \end{align*}

To solve for α\alpha we can set αϕ2=ϕ1α\alpha \phi^2 = \phi - 1 - \alpha, which implies that

α=ϕ1ϕ2+1.\alpha = \frac{\phi - 1}{\phi^2 + 1}.

Thus, the asymptotic growth of the solution is an=O(nαϕn)a_n=O(n^\alpha \phi^n).

Exercise 2.51

The coefficient nn on an1a_{n-1} suggests an=ω(n!)a_n=\omega(n!). Let’s employ a change of variable bn=an/n!b_n=a_n/n! to find a solution to a simpler recurrence

n!bn=n(n1)!bn1+n2(n2)!bn2bn=bn1+n2n(n1)bn2=bn1+nn1bn2=bn1+(1+1n1)bn2\begin{align*} n! \cdot b_n &= n(n-1)!\cdot b_{n-1} + n^2 (n-2)! \cdot b_{n-2} \\ b_n &= b_{n-1} + \frac{n^2}{n(n-1)} b_{n-2} \\ &= b_{n-1} + \frac{n}{n-1} b_{n-2} \\ &= b_{n-1} + \left(1+\frac{1}{n-1}\right) b_{n-2} \end{align*}

As nn \to \infty, 1n10\frac{1}{n-1} \to 0. Therefore, applying the perturbation method again, we end up with another simpler recurrence cn=cn1+cn2c_n = c_{n-1} + c_{n-2} for n>1n>1 with c0=0c_0=0 and c1=1c_1=1. Clearly, cn=Fnϕnc_n=F_n \sim \phi^n, where ϕ\phi is the golden ratio.

To account for the perturbation factor, we reuse the approach from Exercise 2.50. Let’s use the "error" ratio

ρnbnnαϕn     for n>1,\rho_n\sim\frac{b_n}{n^\alpha\phi^n} \space\space\space\space \text{ for $n>1$},

with ρ1=1/ϕ\rho_1=1/\phi. We have

nαϕnρn=(n1)αϕn1ρn1+nn1(n2)αϕn2ρn2ϕ2ρn=(11n)αϕρn1+nn1(12n)αρn2ϕ2ρn=(1αn)ϕρn1+nn1(12αn)ρn2(Taylor expansion (1+x)α1+αx)ϕ2ρnρn1[(1αn)ϕ+nn1(12αn)](assume ρn is non-decreasing)(11n)ϕ2ρnρn1[(11n)(1αn)ϕ+(12αn)](ϕ2ϕ2n)ρnρn1[ϕ(11nαn+O(1n2))+(12αn)](ϕ2ϕ2n)ρnρn1[ϕ(11nαn)+(12αn)](ignore O(1/n2)(ϕ2ϕ2n)ρnρn1[ϕ+11n(ϕ+ϕα+2α)](ϕ2ϕ2n)ρnρn1[ϕ21n(ϕ+ϕα+2α)].\begin{align*} n^\alpha \phi^n \rho_n &= (n-1)^\alpha \phi^{n-1} \rho_{n-1} + \frac{n}{n-1} (n-2)^\alpha \phi^{n-2} \rho_{n-2} \\[0.4cm] \phi^2 \rho_n &= \left(1 - \frac{1}{n}\right)^\alpha \phi \rho_{n-1} + \frac{n}{n-1} \left(1 - \frac{2}{n}\right)^\alpha \rho_{n-2} \\[0.4cm] \phi^2 \rho_n &= \left(1 - \frac{\alpha}{n}\right) \phi \rho_{n-1} + \frac{n}{n-1} \left(1 - \frac{2\alpha}{n}\right) \rho_{n-2} && \text{(Taylor expansion $(1 + x)^\alpha \approx 1 + \alpha x$)} \\[0.4cm] \phi^2 \rho_n &\le \rho_{n-1} \left[\left(1 - \frac{\alpha}{n}\right) \phi + \frac{n}{n-1} \left(1 - \frac{2\alpha}{n}\right)\right] && \text{(assume $\rho_n$ is non-decreasing)} \\[0.4cm] \left(1 - \frac{1}{n}\right)\phi^2 \rho_n &\le \rho_{n-1} \left[\left(1 - \frac{1}{n}\right) \left(1 - \frac{\alpha}{n}\right) \phi + \left(1 - \frac{2\alpha}{n}\right)\right] \\[0.4cm] \left(\phi^2 - \frac{\phi^2}{n}\right) \rho_n &\le \rho_{n-1} \left[\phi \left(1 - \frac{1}{n} - \frac{\alpha}{n} + O\left(\frac{1}{n^2}\right)\right) + \left(1 - \frac{2\alpha}{n}\right)\right] \\[0.4cm] \left(\phi^2 - \frac{\phi^2}{n}\right) \rho_n &\le \rho_{n-1} \left[\phi \left(1 - \frac{1}{n} - \frac{\alpha}{n}\right) + \left(1 - \frac{2\alpha}{n}\right)\right] && \text{(ignore $O(1/n^2$)} \\[0.4cm] \left(\phi^2 - \frac{\phi^2}{n}\right) \rho_n &\le \rho_{n-1} \left[\phi + 1 - \frac{1}{n}(\phi + \phi \alpha + 2\alpha)\right] \\[0.4cm] \left(\phi^2 - \frac{\phi^2}{n}\right) \rho_n &\le \rho_{n-1} \left[\phi^2 - \frac{1}{n}(\phi + \phi \alpha + 2\alpha)\right]. \end{align*}

To solve for α\alpha we can set ϕ2=ϕ+ϕα+2α\phi^2 = \phi + \phi \alpha + 2\alpha, which implies that

ϕ2ϕ=α(ϕ+2)1=α(ϕ+2)α=1ϕ+2.\begin{align*} \phi^2 - \phi &= \alpha(\phi + 2) \\ 1 &= \alpha(\phi + 2) \\ &\therefore \boxed{\alpha = \frac{1}{\phi + 2}}. \end{align*}

Thus, the asymptotic growth of the solution is an=O(n!nαϕn)a_n=O(n!n^\alpha \phi^n).
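The change of variable can be sanity-checked with exact arithmetic. The initial conditions a_0 = a_1 = 1 below are an assumption made for illustration; any positive values behave the same way:

```python
from fractions import Fraction
from math import factorial

def a(n, memo={0: 1, 1: 1}):
    # a_n = n a_{n-1} + n^2 a_{n-2}, with assumed values a_0 = a_1 = 1.
    if n not in memo:
        memo[n] = n * a(n - 1) + n * n * a(n - 2)
    return memo[n]

def b(n, memo={0: Fraction(1), 1: Fraction(1)}):
    # The transformed recurrence b_n = b_{n-1} + (1 + 1/(n-1)) b_{n-2}.
    if n not in memo:
        memo[n] = b(n - 1) + (1 + Fraction(1, n - 1)) * b(n - 2)
    return memo[n]

# b_n must equal a_n / n! exactly for every n.
for n in range(2, 40):
    assert b(n) == Fraction(a(n), factorial(n))
```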

Exercise 2.52

The paper Some Doubly Exponential Sequences by Aho and Sloane presents the solution for a more general recurrence. It is also instructive to read about how variants of the generic form appear in practical problems.

The recurrence from the book, where gn=1g_n=1, generates a sequence A003095 from OEIS shifted by one (the first element should be skipped, since in our case a0=1a_0=1). Using the relationship to the constant c=1.225902443528748c=1.225902443528748\dots, our α\alpha corresponds to c2c^2 (due to the previously mentioned index shift). The digits of α\alpha are listed in A077496 from OEIS.

Exercise 2.53

This is a small variation of Exercise 2.50; the steps are exactly the same. The result is α=ϕ2ϕ2+1\alpha = -\frac{\phi^2}{\phi^2 + 1}.

The perturbation factor is (11/n)<1(1 - 1/n)<1. This means that at every step the sequence is reduced slightly compared to the standard Fibonacci sequence. Therefore, the polynomial correction must decay, so α\alpha must be negative.

Exercise 2.54

In this case, we always choose the smaller subinterval of size (N1)/2\lfloor(N-1)/2\rfloor (see the proof of Theorem 2.3 in the book). Therefore, the recurrence is

BN=B(N1)/2+1     for N>1 with B1=1.B_N=B_{\lfloor(N-1)/2\rfloor}+1 \space \space \space \space \text{ for $N >1$ with $B_1=1$}.

After one iteration, we get BN=B(N3)/4+2B_N=B_{\lfloor(N-3)/4\rfloor}+2 leveraging a useful property of floor and ceiling functions that x/2=x/2\lfloor\lfloor x\rfloor/2 \rfloor=\lfloor x/2 \rfloor. At the end, we must have

N(2k1)2k1    k=lg(N+1)1.\frac{N-(2^k-1)}{2^k} \ge 1 \implies k=\lfloor\lg(N+1)\rfloor-1.

The number of comparisons is BN=k+1=lg(N+1)B_N=k+1=\lfloor\lg(N+1)\rfloor for N>1N >1.
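The closed form can be checked against the recurrence directly (taking B_0 = 0, which the recurrence implicitly needs when N is even):

```python
from math import floor, log2

def B(n):
    # B_N = B_{floor((N-1)/2)} + 1 for N > 1, with B_1 = 1 and B_0 = 0.
    if n <= 1:
        return n
    return B((n - 1) // 2) + 1

# The solved form: B_N = floor(lg(N + 1)) for N > 1.
for n in range(2, 10000):
    assert B(n) == floor(log2(n + 1))
```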

Exercise 2.55

We always choose the largest subinterval of size N/3\lceil N/3\rceil. Therefore, the recurrence is

BN=BN/3+2     for N>1 with B1=1 and B0=0.B_N=B_{\lceil N/3\rceil}+2 \space \space \space \space \text{ for $N >1$ with $B_1=1$ and $B_0=0$}.

Similarly to Exercise 2.54, we can characterize the number of comparisons as BN2log3NB_N\sim2\log_3 N for N>1N>1. If we convert the ternary search formula to use the base-two logarithm, we can see that the leading coefficient is 2/log231.26>12/\log_2 3\approx 1.26 > 1. Ternary search is less efficient than binary search in terms of comparisons. Even though ternary search reduces the problem size faster (dividing by 3 instead of 2), the "cost" of doing so (2 comparisons instead of 1) is too high to be worth it.
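A quick check: at powers of three the recurrence telescopes exactly to B_{3^k} = 2k + 1, and the ratio to log_3 N approaches the leading coefficient 2:

```python
def B(n):
    # B_N = B_{ceil(N/3)} + 2 for N > 1, with B_1 = 1 and B_0 = 0;
    # (n + 2)//3 computes ceil(n/3).
    if n <= 1:
        return n
    return B((n + 2) // 3) + 2

# At powers of three the recurrence telescopes exactly: B_{3^k} = 2k + 1.
for k in range(1, 12):
    assert B(3 ** k) == 2 * k + 1

# B_N / log_3 N approaches the leading coefficient 2.
assert abs(B(3 ** 20) / 20 - 2) < 0.1
```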

🌟 Exercise 2.56

The solution is part of the course material on recurrences (slides 36-37) by RS. One reservation is that zero is treated as having no bits, a point that could be argued.

Exercise 2.57

Exercise 1.5 contains an inductive proof for the RHS of an identity mentioned in this exercise. The LHS is Theorem 2.4 from the book. Both are closed-form solutions to the same recurrence, so they must resolve to the same value for all N>0N>0.

Exercise 2.58

All subproblems below build upon the combinatorial correspondence demonstrated in Exercise 2.56.

{# 1 bits in numbers less than NN} – (NlgN)/2(N \lg N)/2

This function is specified as TN=PN(NlgN)/2T_N=P_N – (N \lg N)/2, where

PN=PN/2+PN/2+N/2     for N>1 with P1=0 (see the book).P_N=P_{\lceil N/2 \rceil}+P_{\lfloor N/2 \rfloor}+\lfloor N/2 \rfloor \space \space \space \space \text{ for $N>1$ with $P_1=0$ (see the book).}

We substitute PN=TN+(NlgN)/2P_N=T_N + (N \lg N)/2 into the above recurrence relation

TN+NlgN2=(TN/2+N/2lgN/22)+(TN/2+N/2lgN/22)+N/2TN=TN/2+TN/2+ΔN,\begin{align*} T_N + \frac{N \lg N}{2} &= \left(T_{\lceil N/2 \rceil} + \frac{\lceil N/2 \rceil \lg \lceil N/2 \rceil}{2} \right) + \left( T_{\lfloor N/2 \rfloor} + \frac{\lfloor N/2 \rfloor \lg \lfloor N/2 \rfloor}{2} \right) + \lfloor N/2 \rfloor \\[0.4cm] &\therefore \boxed{T_N= T_{\lceil N/2 \rceil} + T_{\lfloor N/2 \rfloor} + \Delta_N}, \end{align*}

where the "disturbance" term ΔN\Delta_N is

ΔN=N/2+12(N/2lgN/2+N/2lgN/2NlgN)=N/2+12(N/2(lgN/2lgN)+N/2(lgN/2lgN))(NlgN=N/2lgNN/2lgN)=N/2+12(N/2lg(N/2N)+N/2lg(N/2N))={0if N is even12(N2lg(2N/2N)+N2lg(2N/2N)1)if N is odd.\begin{align*} \Delta_N &= \lfloor N/2 \rfloor + \frac{1}{2} \left( \lceil N/2 \rceil \lg \lceil N/2 \rceil + \lfloor N/2 \rfloor \lg \lfloor N/2 \rfloor - N \lg N \right) \\[0.4cm] &= \lfloor N/2 \rfloor + \frac{1}{2} \left( \lceil N/2 \rceil (\lg \lceil N/2 \rceil - \lg N) + \lfloor N/2 \rfloor (\lg \lfloor N/2 \rfloor - \lg N) \right) && \text{($-N \lg N = -\lceil N/2 \rceil \lg N - \lfloor N/2 \rfloor \lg N$)} \\[0.4cm] &= \lfloor N/2 \rfloor + \frac{1}{2} \left( \lceil N/2 \rceil \lg \left(\frac{\lceil N/2 \rceil}{N}\right) + \lfloor N/2 \rfloor \lg \left(\frac{\lfloor N/2 \rfloor}{N}\right) \right) \\[0.6cm] &= \begin{cases} 0 &\text{if $N$ is even} \\[0.2cm] \frac{1}{2} \left( \bigg\lceil \frac{N}{2} \bigg\rceil \lg \left( \frac{2\lceil N/2 \rceil}{N} \right) + \bigg\lfloor \frac{N}{2} \bigg\rfloor \lg \left( \frac{2\lfloor N/2 \rfloor}{N} \right) - 1 \right) &\text{if $N$ is odd.} \end{cases} \end{align*}

The final simplification follows the ideas from Exercise 1.5 together with an identity lg(M/N)=lg(2M/N)1\lg(M/N) = \lg(2 \cdot M/N) - 1.
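The derivation can be verified numerically: the script below confirms both that P_N counts 1 bits and that the disturbance term matches the recurrence for T_N (floating point, so the comparison uses a small tolerance):

```python
from math import log2

def P(n, memo={1: 0}):
    # P_N = P_{ceil(N/2)} + P_{floor(N/2)} + floor(N/2), with P_1 = 0.
    if n not in memo:
        memo[n] = P((n + 1) // 2) + P(n // 2) + n // 2
    return memo[n]

def T(n):
    return P(n) - n * log2(n) / 2

def delta(n):
    if n % 2 == 0:
        return 0.0
    c, f = (n + 1) // 2, n // 2
    return (c * log2(2 * c / n) + f * log2(2 * f / n) - 1) / 2

for n in range(2, 1200):
    # P_N is the number of 1 bits in the binary representations of 0..N-1.
    assert P(n) == sum(bin(i).count("1") for i in range(n))
    assert abs(T(n) - T((n + 1) // 2) - T(n // 2) - delta(n)) < 1e-9
```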

{# 0 bits in numbers less than NN} – (NlgN)/2(N \lg N)/2

Without repeating the previous derivation, we’ll just state the final results for this and the next subproblem. The starting point is TN=ZN(NlgN)/2T_N=Z_N – (N \lg N)/2, where

ZN=ZN/2+ZN/2+N/21.Z_N = Z_{\lceil N/2 \rceil} + Z_{\lfloor N/2 \rfloor} + \lceil N/2 \rceil - 1.

TN=TN/2+TN/2+ΔNT_N= T_{\lceil N/2 \rceil} + T_{\lfloor N/2 \rfloor} + \Delta_N, where ΔN\Delta_N is

ΔN={1if N is even12(N2lg(2N/2N)+N2lg(2N/2N)1)if N is odd.\Delta_N=\begin{cases} -1 &\text{if $N$ is even} \\[0.2cm] \frac{1}{2} \left( \bigg\lceil \frac{N}{2} \bigg\rceil \lg \left( \frac{2\lceil N/2 \rceil}{N} \right) + \bigg\lfloor \frac{N}{2} \bigg\rfloor \lg \left( \frac{2\lfloor N/2 \rfloor}{N} \right) - 1 \right) &\text{if $N$ is odd.} \end{cases}

{# bits in numbers less than NN} – NlgNN \lg N

The starting point is TN=SNNlgNT_N=S_N – N \lg N. The recurrence for SNS_N is from Exercise 2.56.

TN=TN/2+TN/2+ΔNT_N= T_{\lceil N/2 \rceil} + T_{\lfloor N/2 \rfloor} + \Delta_N, where ΔN\Delta_N is

ΔN={1if N is evenN2lg(2N/2N)+N2lg(2N/2N)if N is odd.\Delta_N= \begin{cases} -1 &\text{if $N$ is even} \\[0.2cm] \bigg\lceil \frac{N}{2} \bigg\rceil \lg \left( \frac{2\lceil N/2 \rceil}{N}\right) + \bigg\lfloor \frac{N}{2} \bigg\rfloor \lg \left( \frac{2\lfloor N/2 \rfloor}{N} \right) &\text{if $N$ is odd.} \end{cases}

Exercise 2.59

We again reuse the combinatorial correspondence demonstrated in Exercise 2.56. The recurrences for RNR_N are (when N>1N>1 with R1=R0=0R_1=R_0=0)

RN=RN2lgN+R2lgN1+lgN(considering the leftmost bit)=RN/2+N/2(considering the rightmost bit).\begin{align*} R_N &= R_{N-2^{\lfloor \lg N \rfloor}}+R_{2^{\lfloor \lg N \rfloor}-1}+\lfloor \lg N \rfloor && \text{(considering the leftmost bit)} \\ &= R_{\lfloor N/2 \rfloor}+\lfloor N/2 \rfloor && \text{(considering the rightmost bit).} \end{align*}
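Both recurrences can be cross-checked against each other and against a direct count of trailing 1s (equivalently, carries; see Exercise 2.67):

```python
def R_right(n, memo={0: 0, 1: 0}):
    # Rightmost-bit recurrence: R_N = R_{floor(N/2)} + floor(N/2).
    if n not in memo:
        memo[n] = R_right(n // 2) + n // 2
    return memo[n]

def R_left(n, memo={0: 0, 1: 0}):
    # Leftmost-bit recurrence, with m = floor(lg N).
    if n not in memo:
        m = n.bit_length() - 1
        memo[n] = R_left(n - 2 ** m) + R_left(2 ** m - 1) + m
    return memo[n]

def trailing_ones(i):
    count = 0
    while i & 1:
        count, i = count + 1, i >> 1
    return count

for n in range(2, 1500):
    assert R_right(n) == R_left(n) == sum(trailing_ones(i) for i in range(n))
```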

🌟 Exercise 2.60

The Python 3 script below generates the required plot.

from functools import lru_cache
from math import floor, log2
import matplotlib.pyplot as plt

@lru_cache(maxsize=None)
def f(n):
    return f(n//2) + f((n + 1)//2) + floor(log2(n)) if n > 1 else 0

def plot(save_path="exercise_2_60.png", label_fontsize=18, tick_fontsize=14, title_fontsize=20):
    xs = list(range(1, 513))
    ys = [f(n) for n in xs]

    # Theoretical comparison curve (scaled to match roughly) 𝛩(N).
    zs = [2 * n for n in xs]

    plt.figure(figsize=(12, 9))
    plt.plot(xs, ys, label="$A_N$", linestyle='-', color='blue', linewidth=1)
    plt.plot(xs, zs, label="$\\Theta(N)$", color='red', linestyle='--', alpha=0.3)

    plt.grid(True)
    plt.legend(fontsize=16)
    plt.xlabel(r"$N$", fontsize=label_fontsize)
    plt.ylabel(r"$A_N$", fontsize=label_fontsize)
    plt.title(r"$A_N = A_{\lfloor N/2 \rfloor} + A_{\lceil N/2 \rceil} + \lfloor \lg N \rfloor$",
              fontsize=title_fontsize)
    plt.xticks(fontsize=tick_fontsize)
    plt.yticks(fontsize=tick_fontsize)
    plt.tight_layout()
    plt.savefig(save_path, dpi=150)
    plt.show()

if __name__ == "__main__":
    plot()
Apparently, the function is smooth and periodic. It runs slightly below the line y=2Ny=2N.

Exercise 2.61

The script from Exercise 2.60 needs to be tweaked a little to produce the desired plot.

This time the function is extremely "spiky." Large jumps occur just after powers of 2.

Exercise 2.62

The following Python 3 script creates the required plot and could be used to test out the variants, as mentioned in the book.

from functools import lru_cache
import matplotlib.pyplot as plt

@lru_cache(maxsize=None)
def c(n):
    return c((n + 1)//2) + c((n + 1)//2) + n if n > 1 else 0

@lru_cache(maxsize=None)
def d(n):
    return d((n + 1)//2) + d((n + 1)//2) + c(n) if n > 1 else 0

def plot(save_path="exercise_2_62.png", label_fontsize=18, tick_fontsize=14, title_fontsize=20):
    xs = list(range(1, 513))
    ys = [d(n) for n in xs]

    plt.figure(figsize=(12, 9))
    plt.plot(xs, ys, label="$D_N$", linestyle='-', color='blue', linewidth=1)

    plt.grid(True)
    plt.xlabel(r"$N$", fontsize=label_fontsize)
    plt.ylabel(r"$D_N$", fontsize=label_fontsize)
    plt.title(r"$D_N = D_{\lceil N/2 \rceil} + D_{\lceil N/2 \rceil} + C_N$, "
              r"where $C_N = C_{\lceil N/2 \rceil} + C_{\lceil N/2 \rceil} + N$",
              fontsize=title_fontsize)
    plt.xticks(fontsize=tick_fontsize)
    plt.yticks(fontsize=tick_fontsize)
    plt.tight_layout()
    plt.savefig(save_path, dpi=150)
    plt.show()

if __name__ == "__main__":
    plot()
We see a figure similar to the one in Exercise 2.61. If NN isn’t a power of 2, then many recursive calls happen on larger indices due to the use of the ceiling function. These effects accumulate and produce large jumps right after these border values. When only floors are used, jumps happen on transitioning to powers of 2. Balancing floors and ceilings in both recurrences completely smooths out the curve.

🌟 Exercise 2.63

The function oscillates wildly and is periodic. Let m=lgNm=\lg N. Values of g(N)g(N) for N[2m,2m+1)N \in [2^m,2^{m+1}) can be generated from previously computed values in the interval [2m1,2m)[2^{m-1},2^m) by adding 2m2^m depending on the parity of NN. Therefore, every subsequent period is a scaled version of its predecessor.

Exercise 2.64

We develop two equivalent recurrence relations, although they appear to be completely different. As the book hinted in Chapter 1, it’s often easier to derive the total amount TNT_N and calculate the average as TN/NT_N/N; we do include zero in the range of numbers less than NN. In both cases below, we follow this tactic.

Considering the Leftmost Bit

[Figure] The constituent parts of the recursive formula for the cumulative number of leading 1s TNT_N in numbers less than NN.

Let’s focus on the leftmost column and define m=lgNm=\lfloor \lg N \rfloor. There are 2m2^m zeros, so we should add N2mN-2^m 1s to our count. This is represented as (1) in the image. There are two major sections demarcated by a horizontal line between the last 0 and first 1. The upper region has m+T2m1m+T_{2^m-1} 1s, where (2) denotes those mm 1s. The total number of 1s in the bottom section depends on whether it contains the row (3). If it doesn’t include (3), then the count is simply zero. Otherwise, it is TN2mT2m1T_{N-2^m}-T_{2^{m-1}}. After combining all these ingredients into a single formula, we get

TN=N2lgN+lgN+T2lgN1+{0if N2lgN<2lgN1,TN2lgNT2lgN1otherwise.T_N=N-2^{\lfloor \lg N \rfloor}+\lfloor \lg N \rfloor+T_{2^{{\lfloor \lg N \rfloor}}-1}+\begin{cases} 0 &\text{if } N-2^{\lfloor \lg N \rfloor} < 2^{\lfloor \lg N \rfloor-1}, \\ T_{N-2^{\lfloor \lg N \rfloor}}-T_{2^{{\lfloor \lg N \rfloor}-1}} &\text{otherwise}. \end{cases}

The base cases are T1=T0=0T_1=T_0=0.

Considering the Rightmost Bit

If NN is a power of two, then TN=RNT_N=R_N due to symmetry. Otherwise, if we focus on the rightmost bit, we see that the range can be split into numbers ending with 1 and those ending with 0. Suppose we count how many leading 1s there are in those two subintervals. All we’re missing is the lgN\lfloor \lg N \rfloor 1s associated with the number 2lgN1[1,N)2^{\lfloor \lg N \rfloor}-1 \in [1,N). Thus, we need to add them to the total. Therefore, the recurrence becomes TN=TN/2+TN/2+lgNT_N=T_{\lfloor N/2 \rfloor} + T_{\lceil N/2 \rceil} + \lfloor \lg N \rfloor for N>1N>1 with T1=0T_1=0. Notice that it is exactly the one from Exercise 2.60.
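Both derivations can be cross-checked against a brute-force count of leading 1s:

```python
def leading_ones(i):
    # Length of the initial run of 1s in the binary representation of i.
    b = bin(i)[2:]
    return len(b) - len(b.lstrip("1"))

def T(n, memo={0: 0, 1: 0}):
    # Rightmost-bit recurrence: T_N = T_{floor(N/2)} + T_{ceil(N/2)} + floor(lg N).
    if n not in memo:
        memo[n] = T(n // 2) + T((n + 1) // 2) + (n.bit_length() - 1)
    return memo[n]

def T_left(n, memo={0: 0, 1: 0}):
    # Leftmost-bit formula from the case analysis above, with m = floor(lg N).
    if n not in memo:
        m = n.bit_length() - 1
        total = n - 2 ** m + m + T_left(2 ** m - 1)
        if n - 2 ** m >= 2 ** (m - 1):
            total += T_left(n - 2 ** m) - T_left(2 ** (m - 1))
        memo[n] = total
    return memo[n]

for n in range(2, 1200):
    assert T(n) == T_left(n) == sum(leading_ones(i) for i in range(n))
```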

🌟 Exercise 2.65

A bitstring of length NN has the form b1b2bNb_1b_2\dots b_N, where each bi{0,1}b_i \in \{0,1\}. The initial run of 1s ends at the first 0 (or continues to the end if all bits are 1). Let XX be a random variable denoting the length of the initial string of 1s in a random bitstring of length NN. The probability mass function is defined as

Pr{X=k}={(12)k12=(12)k+1if 0k<N,(12)Nif k=N.\Pr\{X=k\}=\begin{cases} \left(\cfrac{1}{2}\right)^k \cdot \cfrac{1}{2}=\left(\cfrac{1}{2}\right)^{k+1} &\text{if } 0 \le k<N, \\[0.5cm] \left(\cfrac{1}{2}\right)^N &\text{if } k=N. \end{cases}

The event X=kX=k for 0k<N0 \le k<N occurs if the first kk bits are 1 and the (k+1)(k+1)-th bit is 0. The expected value is

E[X]=k=0N1k12k+1+N12N=(1N+12N)+N2N=112N.\begin{align*} \mathbb{E}[X] &= \sum_{k=0}^{N-1} k \cdot \frac{1}{2^{k+1}} + N \cdot \frac{1}{2^N} \\ &= \left(1 - \frac{N+1}{2^N}\right) + \frac{N}{2^N} \\ &= 1 - \frac{1}{2^N}. \end{align*}
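The closed form can be verified by brute force over all bitstrings of small lengths, with exact rational arithmetic:

```python
from fractions import Fraction
from itertools import product

def expected_run(N):
    # Average length of the initial run of 1s over all 2^N bitstrings.
    total = 0
    for bits in product((0, 1), repeat=N):
        run = 0
        for b in bits:
            if b == 0:
                break
            run += 1
        total += run
    return Fraction(total, 2 ** N)

# E[X] = 1 - 1/2^N exactly.
for N in range(1, 15):
    assert expected_run(N) == 1 - Fraction(1, 2 ** N)
```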

Exercise 2.66

Based on Exercise 2.65, we have that limnμ=E[X]=1\lim_{n \to \infty}\mu=\mathbb{E}[X]=1.

The variance is given by

Var(X)=E[X2]E[X]2.\text{Var}(X) = \mathbb{E}[X^2] - \mathbb{E}[X]^2.

We need to compute

E[X2]=k=0k2Pr{X=k}=k=0k22k+1=121232(11/2)3(k=0k2xk=x(1+x)(1x)3for x=1/2<1)=3.\begin{align*} \mathbb{E}[X^2] &= \sum_{k=0}^{\infty} k^2 \cdot \Pr\{X = k\} \\ &= \sum_{k=0}^{\infty} \frac{k^2}{2^{k+1}} \\ &= \frac{1}{2} \cdot \frac{\frac{1}{2} \cdot \frac{3}{2}}{(1 - 1/2)^3} && \text{$\left(\sum_{k=0}^{\infty} k^2 x^k = \cfrac{x(1 + x)}{(1 - x)^3} \quad \text{for } |x|=1/2 < 1\right)$} \\ &= 3. \end{align*}

Thus, the variance is 31=23-1=2.
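Both moments are easy to confirm numerically; the tail of the series beyond a couple hundred terms is far below double precision:

```python
# Partial sum of E[X^2] = sum_{k>=0} k^2 / 2^(k+1).
ex2 = sum(k * k / 2 ** (k + 1) for k in range(200))
assert abs(ex2 - 3) < 1e-12

# Var(X) = E[X^2] - E[X]^2 = 3 - 1 = 2 in the limit.
var = ex2 - 1 ** 2
assert abs(var - 2) < 1e-12
```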

Exercise 2.67

The total number of carries made when a binary counter increments NN times, from 00 to NN, matches RNR_N, as defined in the book (see also Exercise 2.59).

🌟 Exercise 2.68

In both subproblems, we follow the steps from the proof of Theorem 2.5 outlined in the book.

σ=0\sigma=0

The exact representation of the solution is

a(x)=xγ(1+(αβγ)+(αβγ)2+(αβγ)t), where t=logβx.a(x)=x^\gamma\left(1+\left(\frac{\alpha}{\beta^\gamma}\right)+\left(\frac{\alpha}{\beta^\gamma}\right)^2+\dots \left(\frac{\alpha}{\beta^\gamma}\right)^t\right) \text{, where }t=\lfloor \log_\beta x\rfloor.

If α<βγ    γ>logβα\alpha<\beta^\gamma \implies \gamma > \log_\beta \alpha then the sum converges and

a(x)xγj0(αβγ)j=βγβγαc1xγ.a(x) \sim x^\gamma \sum_{j \ge 0} \left(\frac{\alpha}{\beta^\gamma}\right)^j=\underbrace{\frac{\beta^\gamma}{\beta^\gamma-\alpha}}_{c_1}x^\gamma.

If α=βγ    γ=logβα\alpha=\beta^\gamma \implies \gamma =\log_\beta \alpha then each of the terms in the sum is 1 and the solution is simply

a(x)xγlogβx1c2xγlogx.a(x) \sim x^\gamma \log_\beta x\sim\underbrace{1}_{c_2}\cdot x^\gamma \log x.

σ0\sigma \neq0

The exact representation of the solution is

a(x)=xγ((lgx)σ+(αβγ)(lgxβ)σ+(αβγ)2(lgxβ2)σ+(αβγ)t(lgxβt)σ), where t=logβx.a(x)=x^\gamma\left((\lg x)^\sigma+\left(\frac{\alpha}{\beta^\gamma}\right)\left(\lg \frac{x}{\beta}\right)^\sigma+\left(\frac{\alpha}{\beta^\gamma}\right)^2\left(\lg \frac{x}{\beta^2}\right)^\sigma+\dots \left(\frac{\alpha}{\beta^\gamma}\right)^t\left(\lg \frac{x}{\beta^t}\right)^\sigma\right) \text{, where }t=\lfloor \log_\beta x\rfloor.

Note that

(lgxβk)σ=(lgxklgβ)σ=(lgx)σ(1klgβlgx)σ.\begin{align*} \left(\lg \frac{x}{\beta^k}\right)^\sigma &= (\lg x - k\lg \beta)^\sigma \\ &= (\lg x)^\sigma\left(1-\frac{k\lg \beta}{\lg x}\right)^\sigma. \end{align*}

As xx \to \infty (and hence lgx\lg x \to \infty), for fixed k

(1klgβlgx)σ1.\left(1-\frac{k\lg \beta}{\lg x}\right)^\sigma \to 1.

Furthermore, because (α/βγ)<1(\alpha/\beta^\gamma)<1, the series converges rapidly, so the contribution of terms where kk is close to tt is negligible.

If α<βγ    γ>logβα\alpha<\beta^\gamma \implies \gamma > \log_\beta \alpha then the sum converges and

a(x)xγ(lgx)σj0(αβγ)j=βγβγαc1xγ(lgx)σ.a(x) \sim x^\gamma (\lg x)^\sigma\sum_{j \ge 0} \left(\frac{\alpha}{\beta^\gamma}\right)^j=\underbrace{\frac{\beta^\gamma}{\beta^\gamma-\alpha}}_{c_1}x^\gamma(\lg x)^\sigma.

If α=βγ    γ=logβα\alpha=\beta^\gamma \implies \gamma =\log_\beta \alpha then each of the terms in the sum is 1 and the solution is simply

a(x)xγlogβx(lgx)σ1c2xγ(logx)σ+1.a(x) \sim x^\gamma \log_\beta x (\lg x)^\sigma\sim\underbrace{1}_{c_2} \cdot x^\gamma (\log x)^{\sigma+1}.

🌟 Exercise 2.69

The recurrence falls under case 2 of the Master theorem. The exact solution takes the form

aN=Nlog3N+NP(N).a_N = N \log_3 N + N \cdot P(N).

The function PP represents the fluctuations caused by the floor function N/3\lfloor N/3 \rfloor. We have

P(N)=aNNlog3N     for N>0.P(N)=\frac{a_N}{N}-\log_3 N \space \space \space \space \text{ for $N>0$}.
The plot of the periodic part shows a fractal image, where jumps occur at values of N=43kN=4 \cdot 3^k for k>0k>0.

Exercise 2.70

Here is the Python script for trying out different configurations. Currently, it is set up to draw a variant where the floor is replaced by a ceiling.

from functools import lru_cache
from math import log
import matplotlib.pyplot as plt

@lru_cache(maxsize=None)
def a(n):
    return 3 * a((n + 2)//3) + n if n > 3 else 1  # (n + 2)//3 computes ceil(n/3)

def plot(save_path="exercise_2_70.png", label_fontsize=18, tick_fontsize=14, title_fontsize=20):
    xs = list(range(1, 973))
    ys = [a(n) / n - log(n, 3) for n in xs]

    plt.figure(figsize=(12, 9))
    plt.plot(xs, ys, linestyle='-', color='blue', linewidth=1)

    plt.grid(True, which='both', linestyle='--', alpha=0.4)
    plt.xlabel(r"$N$", fontsize=label_fontsize)
    plt.ylabel(r"$\frac{a_N}{N} - \log_3 N$", fontsize=label_fontsize)
    plt.title(r"Periodic Part of $a_N = 3a_{\lceil N/3 \rceil} + N$",
              fontsize=title_fontsize)
    plt.xticks(fontsize=tick_fontsize)
    plt.yticks(fontsize=tick_fontsize)
    plt.tight_layout()
    plt.savefig(save_path, dpi=150)

    plt.show()

if __name__ == "__main__":
    plot()
This is the plot with the floor replaced by a ceiling. Apparently, the jumps aren’t as steep as in Exercise 2.69.

Exercise 2.71

The exact solution takes the form

a(x)=2x+α2x/β+α22x/β2++αt2x/βt=k=0tαk2x/βk=2x(1+k=1tαk2x1βkβk), where t=logβx.\begin{align*} a(x) &= 2^x + \alpha 2^{x/\beta} + \alpha^2 2^{x/\beta^2} + \cdots + \alpha^t 2^{x/\beta^t} \\ &= \sum_{k=0}^t \alpha^k 2^{x/\beta^k} \\ &= 2^x\left(1 + \sum_{k=1}^t \alpha^k 2^{x \cdot \frac{1-\beta^k}{\beta^k}}\right) \text{, where }t=\lfloor \log_\beta x\rfloor. \end{align*}

Since β>1\beta>1, we have 1βkβk<0\frac{1-\beta^k}{\beta^k}<0 for k1k \ge 1, so each factor 2x(1βk)/βk2^{x(1-\beta^k)/\beta^k} decays exponentially in xx. Meanwhile, the coefficient αkαt=xlogβα\alpha^k \le \alpha^t=x^{\log_\beta \alpha} grows only polynomially in xx. Exponential decay beats polynomial growth, so every term with k1k \ge 1 vanishes as xx \to \infty; since there are only t=O(logx)t=O(\log x) such terms, the whole sum inside the parentheses tends to 1.

Therefore, we can conclude that a(x)=Θ(2x)a(x) = \Theta(2^x).
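A numerical sanity check of the explicit sum, for hypothetical parameters α = 3 and β = 2 (the argument doesn’t depend on the specific values, as long as α > 0 and β > 1). Each term is divided by 2^x up front to avoid overflow:

```python
from math import floor, log

alpha, beta = 3.0, 2.0  # hypothetical parameters

def ratio(x):
    # The explicit sum for a(x), divided by 2^x term by term.
    t = floor(log(x, beta))
    return sum(alpha ** k * 2.0 ** (x / beta ** k - x) for k in range(t + 1))

# Only the k = 0 term survives, so a(x) / 2^x -> 1.
assert abs(ratio(80.0) - 1) < 1e-6
assert ratio(200.0) <= ratio(100.0)
```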

Exercise 2.72

There are many ways to give an asymptotic solution to the recurrence. One approach is to use the more general Akra-Bazzi method. Another one is to draw a recursion tree and discover patterns. There’s a superb tool called VisuAlgo to see recursion trees in action. It has lots of prebuilt recurrences, but it’s also possible to define a custom recurrence by selecting the Custom Code option from the drop-down list. The following snippet contains the definition for this exercise.

if (n <= 3) /* base case */
  return 1;
else /* recursive case */
  return f(3*n/4) + f(n/4) + n;

After entering a value for NN and pressing the Run button, the tool visualizes the whole tree. In general, the cost per level is NN. The largest subproblem size decreases by a factor of 4/3 at each step, so the depth is log4/3N\sim \log_{4/3} N. Therefore, the recurrence satisfies aN=Θ(NlgN)a_N = \Theta(N \lg N).

If we want to be rigorous, then we must prove the hypothesis formulated using a recursion tree, for example, using induction on NN.
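The Θ(N lg N) hypothesis is also easy to probe numerically with a memoized, floor-based reading of the recurrence (the base case of 1 for N ≤ 3 matches the snippet above and isn’t essential):

```python
from functools import lru_cache
from math import log2

@lru_cache(maxsize=None)
def f(n):
    # a_N = a_{floor(3N/4)} + a_{floor(N/4)} + N, with a_N = 1 for N <= 3.
    return f(3 * n // 4) + f(n // 4) + n if n > 3 else 1

# f(N) / (N lg N) should stay bounded between positive constants.
ratios = [f(n) / (n * log2(n)) for n in (10 ** 3, 10 ** 4, 10 ** 5)]
assert all(0.5 < r < 3 for r in ratios)
```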

Exercise 2.73

The same remark applies here as in Exercise 2.72. By drawing a recursion tree, we can see that the depth is lgN\sim \lg N and the amount of work at each step drops by a factor of 3/4. In other words, the cost at level kk is (3/4)kN(3/4)^kN. After summing up costs, we can conclude that the recurrence satisfies aN=Θ(N)a_N=\Theta(N).
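For concreteness, assume subproblems of sizes ⌊N/2⌋ and ⌊N/4⌋, which produce exactly the (3/4)^k level costs described above; summing the geometric series predicts a_N/N → 1/(1 − 3/4) = 4. This instance is an assumption, since the exercise’s exact recurrence isn’t restated here:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    # Hypothetical instance: a_N = a_{floor(N/2)} + a_{floor(N/4)} + N.
    return f(n // 2) + f(n // 4) + n if n > 1 else n

# The ratio approaches 4 from below (floors only lower the value).
for n in (10 ** 4, 10 ** 6):
    assert 2.5 < f(n) / n <= 4
```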

🌟 Exercise 2.74

We assume that all index functions are positive. We prove the lower and upper bounds separately.

Suppose an=O(n)    anc2nd2a_n=O(n) \implies a_n \le c_2n-d_2 for some positive constants c2c_2 and d2d_2 as well as sufficiently large nn. Substituting this hypothesis into our recurrence, we get

an=af(n)+ag(n)+ah(n)+1c2(f(n)+g(n)+h(n))3d2+1(by the inductive hypothesis)=c2nd2(2d21)c2nd2(if we make 2d210    d21/2)an=O(n)    \begin{align*} a_n &= a_{f(n)}+a_{g(n)}+a_{h(n)}+1 \\ &\le c_2(f(n)+g(n)+h(n))-3d_2+1 && \text{(by the inductive hypothesis)} \\ &= c_2n-d_2-(2d_2-1) \\ &\le c_2n-d_2 && \text{(if we make $2d_2-1 \ge 0 \implies d_2 \ge 1/2$)} \\ &\therefore a_n=O(n) \space\space\space\space \checkmark \end{align*}

The lower bound is even easier. Suppose an=Ω(n)    anc1na_n=\Omega(n) \implies a_n \ge c_1n for some positive constant c1c_1 and sufficiently large nn. Substituting this hypothesis into our recurrence, we get

an=af(n)+ag(n)+ah(n)+1c1(f(n)+g(n)+h(n))+1(by the inductive hypothesis)=c1n+1c1nan=Ω(n)    \begin{align*} a_n &= a_{f(n)}+a_{g(n)}+a_{h(n)}+1 \\ &\ge c_1(f(n)+g(n)+h(n))+1 && \text{(by the inductive hypothesis)} \\ &= c_1n+1 \\ &\ge c_1n \\ &\therefore a_n=\Omega(n) \space\space\space\space \checkmark \end{align*}

Since both bounds are the same, we can conclude that an=Θ(n)a_n=\Theta(n). Observe that we can choose suitable values for our constants to satisfy the base cases, too. For example, the configuration c1=1/t,c2=3/2c_1=1/t, c_2=3/2 and d2=1/2d_2=1/2 works.
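The bounds can be validated empirically for a concrete, hypothetical choice of index functions satisfying f(n) + g(n) + h(n) = n, with t = 2:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def a(n):
    # f = g = floor(n/3), h = the remainder; a_n = 1 for n <= t = 2.
    if n <= 2:
        return 1
    f = g = n // 3
    h = n - f - g
    return a(f) + a(g) + a(h) + 1

# The proof's constants give n/2 <= a_n <= (3/2) n - 1/2 for this instance.
for n in range(1, 3000):
    assert n / 2 <= a(n) <= 1.5 * n - 0.5
```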

🌟 Exercise 2.75

The recurrence ana_n is the standard formula for counting the number of nodes (internal nodes + leaves) in a binary tree with a "coarsening effect" (subtrees of size t\le t are counted as 1).

  • The +1+1 represents the current node.

  • af(n)a_{f(n)} is the number of nodes in the left child's subtree.

  • ag(n)a_{g(n)} is the number of nodes in the right child's subtree.

Since f(n)1f(n) \ge 1 and g(n)1g(n) \ge 1, all non-leaf nodes have exactly two children, which is the definition of a full binary tree. A key property of non-empty full binary trees is that the number of leaf nodes is one more than the number of internal nodes. Consequently, the number of internal nodes is roughly an/2a_n/2.

We are also given the condition f(n)+g(n)=nh(n)f(n) + g(n) = n - h(n). This equation tells us how the "size" of the problem decreases at each step. At any internal node with size nn, the combined size of its children is nh(n)n-h(n). This means that at every step of the recursion, the size drops by h(n)h(n). Since the recursion starts with nn and ends at the leaves, the sum of reductions over all internal nodes must be approximately equal to the initial size nn (minus the small residuals at the leaves). This can be expressed as

nall internal nodes ih(sizei)i=1anh(sizei).n \approx \sum_{\text{all internal nodes } i} h(\text{size}_i) \approx \sum_{i=1}^{a_n} h(\text{size}_i).

Let’s assume that h(n)=O(1)    h(n)ch(n)=O(1) \implies h(n) \le c for some positive constant cc and sufficiently large nn. Substituting this into the above equation, we get

ni=1anc=can    ann1c>0 as n.n \le \sum_{i=1}^{a_n} c=ca_n \implies \frac{a_n}{n} \ge \frac{1}{c} > 0 \text{ as $n \to \infty$}.

We must have h(n)=ω(1)h(n)=\omega(1). Thus, in the summation, the terms h(sizei)h(\text{size}_i) become larger on average as nn grows. Roughly speaking nanhaveragen \approx a_n \cdot h_{\text{average}}, which implies

ann1haverage0.\frac{a_n}{n} \approx \frac{1}{h_{\text{average}}} \to 0.

So, we just need to ensure that h(n)h(n) grows unboundedly (arbitrarily slowly) as nn \to \infty.
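A small experiment with a hypothetical h(n) = ⌊lg n⌋, which grows unboundedly but slowly, illustrates the conclusion: the ratio a_n/n drifts toward 0 (roughly like 2/lg n):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def a(n):
    # f(n) and g(n) split n - h(n) as evenly as possible;
    # subtrees of size <= 4 count as 1.
    if n <= 4:
        return 1
    h = n.bit_length() - 1  # floor(lg n): unbounded, but slowly growing
    f = (n - h) // 2
    g = n - h - f
    return a(f) + a(g) + 1

# The ratio a_n / n keeps shrinking as n grows.
r1, r2, r3 = (a(2 ** k) / 2 ** k for k in (10, 14, 18))
assert r1 > r2 > r3
```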
