# Appendix D

## Exercise D.1-1

For all $$i=1,2,\dots,n$$ and $$j=1,2,\dots,n$$ we have

$$
\begin{align*}
c_{ij} &= a_{ij}+b_{ij} \\
&= a_{ji}+b_{ji} \\
&= c_{ji} \\[2mm]
d_{ij} &= a_{ij}-b_{ij} \\
&= a_{ji}-b_{ji} \\
&= d_{ji}
\end{align*}
$$

so $$C=A+B$$ and $$D=A-B$$ are also symmetric matrices.
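As a quick numeric sanity check (a sketch with an arbitrarily chosen pair of symmetric matrices, not part of the proof), the sum and difference indeed come out symmetric:

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def add(A, B, sign=1):
    # entrywise A + sign*B
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# two arbitrary symmetric matrices
A = [[1, 2, 3],
     [2, 5, 6],
     [3, 6, 9]]
B = [[0, 7, 1],
     [7, 4, 2],
     [1, 2, 8]]

C = add(A, B)        # C = A + B
D = add(A, B, -1)    # D = A - B
print(C == transpose(C), D == transpose(D))  # True True
```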

## Exercise D.1-2

Let $$A$$ be a matrix of size $$p \times q$$ and $$B$$ a matrix of size $$q \times r$$. We have

$$
\begin{align*}
c_{ji}^T = c_{ij} &= \sum_{k=1}^q a_{ik}b_{kj} && \text{(} C^T=(AB)^T \text{)} \\
&= \sum_{k=1}^q b_{jk}^T a_{ki}^T && \text{(} C^T=B^TA^T \text{)}
\end{align*}
$$

Also, $$(A^TA)^T=A^T(A^T)^T=A^TA$$, so $$A^TA$$ is always a symmetric matrix.
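Both identities can be checked numerically (a sketch on an arbitrary example, not the proof): $$(AB)^T=B^TA^T$$ and $$A^TA$$ is symmetric.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

# arbitrary rectangular matrices with compatible shapes
A = [[1, 2, 3],
     [4, 5, 6]]          # 2 x 3
B = [[7, 8],
     [9, 1],
     [2, 4]]             # 3 x 2

lhs = transpose(matmul(A, B))               # (AB)^T
rhs = matmul(transpose(B), transpose(A))    # B^T A^T
AtA = matmul(transpose(A), A)
print(lhs == rhs, AtA == transpose(AtA))  # True True
```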

## Exercise D.1-3

Let $$A$$ and $$B$$ be $$n \times n$$ lower-triangular matrices and  $$C=AB$$. For all $$1 \le i < j \le n$$ we have

$$
\begin{align*}
c_{ij} &= \sum_{k=1}^n a_{ik}b_{kj} \\
&= \sum_{k=1}^{j-1}a_{ik}b_{kj}+\sum_{k=j}^na_{ik}b_{kj} \\
&= \sum_{k=1}^{j-1}a_{ik}\cdot0+\sum_{k=j}^n0\cdot b_{kj} \\
&= 0
\end{align*}
$$

Therefore, the matrix $$C$$ is also lower-triangular.
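A small check of this claim (a sketch with two arbitrary lower-triangular matrices): every entry of $$C$$ above the diagonal is zero.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# arbitrary lower-triangular matrices
A = [[2, 0, 0],
     [1, 3, 0],
     [4, 5, 6]]
B = [[7, 0, 0],
     [8, 9, 0],
     [1, 2, 3]]

C = matmul(A, B)
# all entries strictly above the diagonal are zero
print(all(C[i][j] == 0 for i in range(3) for j in range(3) if j > i))  # True
```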

## Exercise D.1-4

As explained in the book, multiplying a vector $$x$$ by a permutation matrix has the effect of permuting (rearranging) the elements of $$x$$. We can represent a matrix as a sequence of column vectors $$A=\braket{c_1,c_2,\dots,c_n}$$. Thus, $$PA=\braket{Pc_1,Pc_2,\dots,Pc_n}$$ is a matrix where each column vector of $$A$$ is permuted in the same way. In other words, rows of $$A$$ are permuted. Analogously, we can view a matrix as a sequence of row vectors $$A=\braket{r_1^T,r_2^T,\dots,r_n^T}$$, hence $$AP=\braket{r_1^TP,r_2^TP,\dots,r_n^TP}$$ is a matrix where each row vector of $$A$$ is permuted in the same way. In other words, columns of $$A$$ are permuted.

The previous explanation proves the statement from the book at a high level. Let us see where elements of $$A$$ land in $$PA$$. Row $$i$$ of $$P$$ contains only a single 1, at some column index $$q$$, so $$p_{iq}=1$$ and 0s elsewhere. We have

$$
\begin{align*}
(PA)_{ij} &= \sum_{k=1}^np_{ik}a_{kj} \\
&= p_{iq}a_{qj} \\
&= a_{qj}
\end{align*}
$$

Notice that all elements from row $$q$$ of $$A$$ go into row $$i$$ of $$PA$$. In other words, the rows of $$A$$ are permuted. This works because each row of $$P$$ has its single 1 in a distinct column. A similar observation applies to $$AP$$, except that instead of permuting rows we permute columns.

In particular, if we replace $$A$$ with $$P$$, then $$PP$$ is another permutation matrix, as permuting the columns or rows of $$P$$ preserves the property that it has exactly one 1 in each row and each column, and 0s elsewhere.
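The row/column behavior is easy to see on a concrete example (a sketch with one arbitrary $$3 \times 3$$ permutation matrix):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# a cyclic permutation matrix
P = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

print(matmul(P, A))  # rows of A reordered: [[4, 5, 6], [7, 8, 9], [1, 2, 3]]
print(matmul(A, P))  # columns of A reordered: [[3, 1, 2], [6, 4, 5], [9, 7, 8]]

PP = matmul(P, P)
# PP is again a 0-1 matrix with a single 1 in each row and each column
print(all(sum(row) == 1 for row in PP)
      and all(sum(col) == 1 for col in zip(*PP)))  # True
```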

## Exercise D.2-1

Suppose, for contradiction, that $$B \neq C$$. We have

$$
B=BI=B(AC)=(BA)C=IC=C
$$

which contradicts our initial assumption of distinct inverses. Therefore, matrix inverses are unique.

## Exercise D.2-2

Let $$A$$ be an $$n \times n$$ lower-triangular matrix. We prove by induction on $$n$$ that $$\det(A)$$ equals the product of its diagonal elements. The base case holds: for $$n=1$$, $$\det(A)=a_{11}$$. For $$n>1$$ we have

$$
\begin{align*}
\det(A) &= \sum_{j=1}^n(-1)^{1+j}a_{1j}\det(A_{[1j]}) \\
&= (-1)^{1+1}a_{11}\det(A_{[11]}) \\
&= a_{11}\prod_{i=2}^na_{ii} && \text{(by the inductive hypothesis)} \\
&= \prod_{i=1}^na_{ii}
\end{align*}
$$

According to the principles of mathematical induction, we can conclude that the statement holds for all $$n \ge 1$$. For an upper-triangular matrix the proof is nearly identical, except that we start at $$a_{nn}$$ and work backwards.

Suppose, for contradiction, that $$B=A^{-1}$$ is not lower-triangular, i.e., row $$i$$ of $$B$$ has some nonzero entry $$b_{ij}$$ with $$j>i$$. We know that $$BA=I$$ and row $$i$$ of $$I$$ has only one nonzero element, at column $$i$$, so $$e_{ij}=0$$ for $$i \neq j$$. Since $$A$$ is lower-triangular, $$a_{kj}=0$$ for $$k<j$$, hence $$e_{ij}=\sum_{k=j}^n b_{ik}a_{kj}$$. Working from the last column backwards, $$e_{in}=b_{in}a_{nn}=0 \implies b_{in}=0$$; then $$e_{i(n-1)}=b_{i(n-1)}a_{(n-1)(n-1)}+b_{in}a_{n(n-1)}=b_{i(n-1)}a_{(n-1)(n-1)}=0 \implies b_{i(n-1)}=0$$, and this cascading effect continues down to column $$i+1$$. It turns out that $$b_{ij}=0$$ for all $$j>i$$, which contradicts our assumption that $$B$$ is not lower-triangular. Observe that all diagonal elements of $$A$$ must be nonzero, otherwise $$A$$ would be singular.
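Both claims of this exercise can be verified numerically (a sketch with one arbitrarily chosen matrix, not part of the proof). The `inverse` helper below assumes its pivots are nonzero, which holds for any nonsingular triangular matrix.

```python
from fractions import Fraction

def det(M):
    # cofactor expansion along the first row (exact integer arithmetic)
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, a in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def inverse(M):
    # Gauss-Jordan elimination on [M | I] with exact rational arithmetic;
    # assumes every pivot is nonzero (true for nonsingular triangular M)
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for i in range(n):
        pivot = aug[i][i]
        aug[i] = [x / pivot for x in aug[i]]
        for k in range(n):
            if k != i and aug[k][i]:
                aug[k] = [x - aug[k][i] * y for x, y in zip(aug[k], aug[i])]
    return [row[n:] for row in aug]

# an arbitrary nonsingular lower-triangular matrix
A = [[2, 0, 0],
     [3, 4, 0],
     [5, 6, 7]]

print(det(A) == 2 * 4 * 7)  # True: determinant = product of the diagonal
B = inverse(A)
# the inverse is again lower-triangular
print(all(B[i][j] == 0 for i in range(3) for j in range(3) if j > i))  # True
```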

## Exercise D.2-3

Wikipedia has a proof [here](https://en.wikipedia.org/wiki/Permutation_matrix#The_transpose_is_also_the_inverse) that $$P$$ is invertible and $$P^{-1}=P^T$$.

$$P^T$$ maintains the property that it has exactly one 1 in each row and each column, and 0s elsewhere, since transposing a matrix swaps its rows and columns. Therefore, $$P^T$$ is also a permutation matrix.
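A quick check on one arbitrary permutation matrix (a sketch, not the proof): $$PP^T=P^TP=I$$.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# an arbitrary 4 x 4 permutation matrix
P = [[0, 0, 1, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 1, 0, 0]]
Pt = [list(row) for row in zip(*P)]
I = [[int(i == j) for j in range(4)] for i in range(4)]

print(matmul(P, Pt) == I, matmul(Pt, P) == I)  # True True
```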

## Exercise D.2-4

We know that $$a_i^T \cdot b_j=0$$, where $$i \neq j$$, and $$a_i^T \cdot b_i=1$$; here $$a_*^T$$ is a row vector of $$A$$ and $$b_*$$ is a column vector of $$B$$. All row vectors of $$A$$, except row $$i$$, are the same in $$A'$$. All column vectors of $$B$$, except column $$j$$, are the same in $$B'$$. It is stated that $$a_i'^T=a_i^T+a_j^T$$ and $$b_j'=b_j-b_i$$, where $$i \neq j$$. $$A'B'=I$$ since for all distinct $$i$$, $$j$$ and $$k$$ we still have

$$
\begin{align*}
a_i'^T \cdot b_j' &= a_i^T \cdot b_j - a_i^T \cdot b_i + a_j^T \cdot b_j - a_j^T \cdot b_i=0-1+1-0=0 \\
a_k^T \cdot b_j' &= a_k^T \cdot b_j - a_k^T \cdot b_i=0-0=0 \\
a_i'^T \cdot b_k &= a_i^T \cdot b_k + a_j^T \cdot b_k=0+0=0
\end{align*}
$$

On the other hand, $$a_i'^T \cdot b_i=a_j^T \cdot b_j'=a_k^T \cdot b_k=1$$, hence $$A'B'$$ has 1s on the main diagonal and 0s elsewhere.

## Exercise D.2-5

We only prove one direction, as the opposite is symmetrical due to $$(A^{-1})^{-1}=A$$. Assume every entry of $$A$$ is real. Let its inverse $$B=A^{-1}$$ be represented as a sequence of complex column vectors $$b_j=x_j+iy_j$$ for $$j=1,2,\dots,n$$. Here, the real $$n$$-vectors $$x_j$$ and $$y_j$$ denote the real and imaginary parts of $$b_j$$, respectively. We know that $$AB=I$$, hence $$Ab_j=Ax_j+iAy_j=e_j$$ is a real $$n$$-vector with a single 1 at index $$j$$ and 0s elsewhere. Since $$A$$ is real, matching imaginary parts gives $$Ay_j=0 \implies y_j=0$$, based on Theorem D.1 and Theorem D.2. The previous reasoning applies for all $$j=1,2,\dots,n$$, thus $$B$$ is also a real matrix.

## Exercise D.2-6

Since $$A$$ is symmetric and nonsingular, $$A^T=A$$, so using $$(A^T)^{-1}=(A^{-1})^T$$ we have

$$
\begin{align*}
(A^{-1})^T &= (A^T)^{-1} = A^{-1} \\[2mm]
(BAB^T)^T &= ((BA)B^T)^T \\
&= (B^T)^T(BA)^T \\
&= BA^TB^T \\
&= BAB^T
\end{align*}
$$

## Exercise D.2-7

Let $$A=\braket{a_1,a_2,\dots,a_n}$$ be a matrix represented as a sequence of column vectors and let $$x$$ be an $$n$$-vector, thus $$Ax=\sum_{i=1}^n a_ix_i$$. If $$A$$ has full column rank, then all column vectors are linearly independent, therefore $$Ax=0 \implies x=0$$. In the opposite direction, assume, for contradiction, that the column rank of $$A$$ is less than $$n$$ even though $$A$$ does not have a null vector. Then the column vectors are linearly dependent, so there is some $$x \neq 0$$ such that $$Ax=0$$. But this contradicts the initial assumption of $$Ax=0 \implies x=0$$. This concludes the proof of Theorem D.2.

## Exercise D.2-8

Let $$A_{m \times p}$$ and $$B_{p \times n}$$ be matrices whose ranks are $$r_A$$ and $$r_B$$, respectively. According to the alternate definition of the rank of a matrix, we have that $$A=C'_{m \times r_A}D'_{r_A \times p}$$ and $$B=C''_{p \times r_B}D''_{r_B \times n}$$. Let $$(AB)_{m \times n}$$ be a matrix whose rank is $$r$$, so $$AB=C_{m \times r}D_{r \times n}$$. We can express $$AB=C'D'C''D''$$ in two different ways as follows:

* $$AB=(C'D'C'')_{m \times r_B}D''_{r_B \times n} \implies r \le r_B$$
* $$AB=C'_{m \times r_A}(D'C''D'')_{r_A \times n} \implies r \le r_A$$

Therefore, we can conclude that $$r \le \min(r_A,r_B)$$.

Now, assume that $$A_{m \times m}$$ is a nonsingular matrix, so $$r_A=m$$. If $$B_{m \times m}$$ is also a nonsingular matrix then obviously $$r_B=m$$ and $$r=m$$, so equality holds. Thus, let $$B_{m \times n}$$ be an arbitrary matrix with rank $$r_B$$; note that $$r_B \le \min(m,n) \le m=r_A$$. We have

$$
B = A^{-1}AB = (A^{-1}C)_{m \times r}D_{r \times n} \implies r_B \le r
$$

Consequently, $$(r \le r_B \land r \ge r_B) \implies r=r_B=\min(r_A,r_B)$$. The mirror case, where only $$B$$ is nonsingular, is proven analogously.
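Both the inequality and the equality case can be checked on small examples (a sketch with arbitrarily chosen matrices; the `rank` helper does exact rational row reduction):

```python
from fractions import Fraction

def rank(M):
    # row reduction with exact rational arithmetic
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c]), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c] / M[r][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 3],
     [2, 4, 6],
     [0, 1, 1]]          # rank 2 (second row is twice the first)
B = [[1, 0, 1],
     [0, 1, 0],
     [1, 1, 1]]          # rank 2 (third row is the sum of the first two)

AB = matmul(A, B)
print(rank(AB) <= min(rank(A), rank(B)))  # True

N = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 1]]          # nonsingular
print(rank(matmul(N, B)) == rank(B))  # True: equality when one factor is nonsingular
```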

## Problem D-1

We prove the statement from the book using induction on $$n$$. The base case $$n=1$$ trivially holds, as $$\det(V(x_0))=\prod_{0\le j<k\le 0}(x_k-x_j)=1$$ (an empty product).

For $$n>1$$ we rely on Theorem D.4, while applying the hint from the book, which says that a determinant is unchanged if entries in one row/column are added to those in another row/column. This can be extended to the case where the entries are first scaled by some factor $$\lambda \neq 0$$: Theorem D.4 also states that a determinant is multiplied by such a factor if all entries in some row/column are multiplied by it, so we can multiply a row/column by $$\lambda$$, add the scaled entries to another row/column, and then divide the first row/column by $$\lambda$$ again.

We transform the Vandermonde matrix $$V$$, by following the hint, to get

$$
V'=\begin{pmatrix}
1 & 0 & 0 & 0 & \cdots & 0 \\[2mm]
1 & x_1-x_0 & x_1(x_1-x_0) & x_1^2(x_1-x_0) & \cdots & x_1^{n-2}(x_1-x_0) \\[2mm]
1 & x_2-x_0 & x_2(x_2-x_0) & x_2^2(x_2-x_0) & \cdots & x_2^{n-2}(x_2-x_0) \\[2mm]
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\[2mm]
1 & x_{n-1}-x_0 & x_{n-1}(x_{n-1}-x_0) & x_{n-1}^2(x_{n-1}-x_0) & \cdots & x_{n-1}^{n-2}(x_{n-1}-x_0)
\end{pmatrix}
$$

In the minor $$V'_{[11]}$$, row $$i$$ has the common factor $$x_i-x_0$$ for $$i=1,2,\dots,n-1$$, so we have

$$
\begin{align*}
\det(V(x_0,x_1,\dots,x_{n-1})) &= \det(V') \\
&= \det(V'_{[11]}) \\
&= \prod_{i=1}^{n-1}(x_i-x_0)\det(V(x_1,x_2,\dots,x_{n-1})) \\
&= \prod_{i=1}^{n-1}(x_i-x_0)\prod_{1\le j<k\le n-1}(x_k-x_j) && \text{(by the inductive hypothesis)} \\
&= \prod_{0\le j<k\le n-1}(x_k-x_j)
\end{align*}
$$

According to the principles of mathematical induction, we can conclude that the statement from the book holds for all $$n \ge 1$$.
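The closed form can be checked for a concrete choice of points (a sketch with arbitrarily chosen integers, using exact cofactor expansion):

```python
def det(M):
    # cofactor expansion along the first row (exact integer arithmetic)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j, a in enumerate(M[0]))

xs = [2, 5, 7, 11]                      # arbitrary distinct points
n = len(xs)
V = [[x ** p for p in range(n)] for x in xs]   # Vandermonde matrix

# prod over 0 <= j < k <= n-1 of (x_k - x_j)
expected = 1
for k in range(n):
    for j in range(k):
        expected *= xs[k] - xs[j]

print(det(V) == expected)  # True
```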

## Problem D-2

### a.

Let $$A$$ be a matrix represented as a sequence of column vectors $$a_i$$ for $$i=0,1,\dots,n-1$$. WLOG assume that the first $$r$$ column vectors are linearly independent; otherwise we would explicitly work with a set of $$r$$ indices $$J \subseteq \{0,1,\dots,n-1\}$$ that denotes those linearly independent columns. We prove the upper and lower bound separately.

For any $$n$$-bit vector $$x$$ we can always find a matching $$n$$-bit vector $$y$$, with 0s on the last $$n-r$$ positions, such that $$Ax=Ay$$. This is attainable, as the first $$r$$ linearly independent column vectors of $$A$$ can represent all possible combinations of the last $$n-r$$ dependent vectors. There are at most $$2^r$$ ways of choosing $$y$$ for all possible inputs $$x$$, hence $$|R(A)| \le 2^r$$.

We claim that each distinct setting of the first $$r$$ bits of an $$n$$-bit vector $$x$$ produces a distinct image $$Ax$$. Suppose, for contradiction, that there are two such vectors $$x \neq y$$, both with 0s in the last $$n-r$$ bit positions, such that $$Ax=Ay$$. Then $$A(x-y)=0$$, so $$x-y \neq 0$$ is a null vector of $$A$$ supported on the first $$r$$ coordinates. But this contradicts the linear independence of the first $$r$$ column vectors (see also [Exercise D.2-7](#exercise-d.2-7)). Thus, $$|R(A)| \ge 2^r$$.

$$(|R(A)| \le 2^r \land |R(A)| \ge 2^r) \implies |R(A)|=2^r$$ and $$r<n \implies |R(A)| < 2^n=|S_n|$$. Therefore, $$A$$ must have full rank to define a permutation on $$S_n$$.

### b.

Suppose, for contradiction, that there is some $$y$$ such that $$|P(A,y)|>2^{n-r}$$. By the pigeonhole principle, two vectors in $$P(A,y)$$ would then share the same segment of $$n-r$$ bits while differing in the $$r$$ bits associated with the linearly independent columns of $$A$$. But this would mean that $$|R(A)|<2^r$$, which contradicts part (a). Therefore, $$|P(A,y)|\le2^{n-r}$$. On the other hand, $$|P(A,y)|\ge2^{n-r}$$: the $$2^r$$ preimages partition $$S_n$$ with $$|S_n|=2^n$$, so no preimage could have fewer than $$2^{n-r}$$ elements without some other having more, which we have just shown is impossible. We can conclude that $$|P(A,y)|=2^{n-r}$$.
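Parts (a) and (b) can be verified exhaustively for one small 0-1 matrix over GF(2) (a sketch with an arbitrarily chosen matrix of rank 2):

```python
from collections import Counter
from itertools import product

A = [[1, 0, 1],
     [0, 1, 1],
     [1, 1, 0]]          # rank 2 over GF(2): row 2 = row 0 + row 1 (mod 2)
n = len(A)
r = 2

# count how many x map to each image y = Ax (mod 2)
images = Counter()
for x in product([0, 1], repeat=n):
    y = tuple(sum(A[i][j] * x[j] for j in range(n)) % 2 for i in range(n))
    images[y] += 1

print(len(images) == 2 ** r)                           # True: |R(A)| = 2^r
print(all(c == 2 ** (n - r) for c in images.values())) # True: |P(A,y)| = 2^(n-r)
```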

### c.

At first glance, this task seems complicated, but it is essentially a corollary of parts (a) and (b). Let $$A$$ be a matrix represented as a sequence of column vectors $$a_i$$ for $$i=0,1,\dots,n-1$$ as before. An image of $$x \in S$$ is a linear combination of these vectors, $$y=Ax=\sum_{i=0}^{n-1} a_ix_i$$. The identifier of the block where $$y$$ belongs spans its last $$n-m$$ bit positions, while the first $$m$$ bits specify the offset. Recall that every block is of size $$2^m$$. This mirrors the structure of each column vector of $$A$$, hence only the last $$n-m$$ rows of $$A$$ determine the block identifier. Inputs $$x$$ from separate blocks differ by a constant offset $$i2^m$$, so their images differ by the fixed vector $$A \cdot (i2^m)$$. WLOG we may assume that our source block $$S$$ is block 0. Consequently, all input $$n$$-bit vectors $$x$$ have 0s in the last $$n-m$$ positions, thus only the leftmost $$m$$ columns of $$A$$ are relevant; all the remaining columns of $$A$$ are multiplied by 0s during the matrix-vector multiplication $$Ax$$.

A fundamental property of any matrix $$A$$ is that its row rank always equals its column rank. The fact that $$r$$ is the rank of the lower-left $$(n-m) \times m$$ submatrix of $$A$$ implies that there are $$r$$ linearly independent column vectors among the leftmost $$m$$ columns of $$A$$. Every distinct combination of these vectors produces a unique block identifier. As a corollary of part (a) we have that $$|B(S',m)|=2^r$$. Furthermore, as a corollary of part (b), we can also conclude that for each block in $$B(S',m)$$, exactly $$2^{m-r}$$ numbers in $$S$$ map to that block.

### d.

There are $$(2^n)!$$ permutations of $$S_n$$. The number of linear permutations is bounded above by $$|A|\times|c|$$, where $$|A|=2^{n^2}$$ (the total count of $$n \times n$$ 0-1 matrices) and $$|c|=2^n$$ (the total count of $$n$$-bit vectors), i.e., by $$2^{n^2} \cdot 2^n=2^{n(n+1)}$$. For $$n \ge 3$$ it holds that $$(2^n)!>2^{n(n+1)}$$, so some permutation of $$S_n$$ is not linear.

### e.

We start by answering the hint from the book. Multiplying a matrix by a unit vector $$e_i$$ selects its $$i$$th column for $$i=1,2,\dots,n$$, so the images of the unit vectors, together with $$\pi_{A,c}(0)=c$$, determine $$A$$ and $$c$$ completely. By part (d) the smallest $$n$$ is 3, so choose $$S_3$$ with the following partial permutation $$\pi_{A,c}$$: $$\pi_{A,c}(0)=0$$, $$\pi_{A,c}(1)=2$$, $$\pi_{A,c}(2)=4$$, $$\pi_{A,c}(4)=1$$. Now all parameters are fixed: $$c=0$$ and every column of $$A$$ is determined. Linearity then forces $$\pi_{A,c}(3)=\pi_{A,c}(1) \oplus \pi_{A,c}(2)=6$$, so we just choose something else for 3. Here is the rest of the permutation, which therefore cannot be achieved with the linear permutation model: $$\pi_{A,c}(3)=7$$, $$\pi_{A,c}(7)=3$$, $$\pi_{A,c}(6)=6$$, $$\pi_{A,c}(5)=5$$.
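The forcing argument above can be checked directly (a sketch): once $$c=0$$ and the images of the unit vectors 1, 2, 4 are fixed as chosen, every value of a linear permutation is the XOR of the images of the set bits, and 3 is necessarily sent to 6, not 7.

```python
# images of the basis vectors 1 = 001, 2 = 010, 4 = 100 under pi (c = 0)
pi = {1: 2, 2: 4, 4: 1}

def linear_image(x):
    # over GF(2), x is the sum of its set bits, so a linear map sends x
    # to the XOR of the images of those bits
    y = 0
    for bit in (1, 2, 4):
        if x & bit:
            y ^= pi[bit]
    return y

print(linear_image(3))                        # 6, not the required 7
print([linear_image(x) for x in range(8)])    # the unique linear extension
```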
