> For the complete documentation index, see [llms.txt](https://evarga.gitbook.io/sh-intro-to-algs/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://evarga.gitbook.io/sh-intro-to-algs/part-viii-appendix-mathematical-background/appendix-d.md).

# D. Matrices

## Exercises

### D.1-1 🌟

{% hint style="success" %}
Shows that if $$A$$ and $$B$$ are symmetric $$n \times n$$ matrices, then so is $$A ± B$$.
{% endhint %}

For all $$i=1,2,\dots,n$$ and $$j=1,2,\dots,n$$ we have

$$
\begin{align\*}
c\_{ij} &= a\_{ij}+b\_{ij} \\
&= a\_{ji}+b\_{ji} \\
&= c\_{ji}, \\
d\_{ij} &= a\_{ij}-b\_{ij} \\
&= a\_{ji}-b\_{ji} \\
&= d\_{ji},
\end{align\*}
$$

so $$C=A+B$$ and $$D=A-B$$ are also symmetric matrices.

### D.1-2 🌟

{% hint style="success" %}
Proves that $$(AB)^T = B^TA^T$$ and that $$A^TA$$ is always a symmetric matrix.
{% endhint %}

Let $$A$$ be a matrix of size $$p \times q$$ and $$B$$ a matrix of size $$q \times r$$. We have

$$
\begin{align\*}
c\_{ji}^T =c\_{ij}&= \sum\_{k=1}^qa\_{ik}b\_{kj} && \text{(} C^T=(AB)^T \text{)} \\
&= \sum\_{k=1}^qb\_{kj}a\_{ik} \\
&= \sum\_{k=1}^q{b\_{jk}^T}{a\_{ki}^T} && \text{(} C^T=(B^TA^T) \text{)}.
\end{align\*}
$$

Also, $$(A^TA)^T=A^T(A^T)^T=A^TA$$, so $$A^TA$$ is always a symmetric matrix.

### D.1-3 🌟

{% hint style="success" %}
Proves that the product of two lower-triangular matrices is lower-triangular.
{% endhint %}

Let $$A$$ and $$B$$ be $$n \times n$$ lower-triangular matrices and  $$C=AB$$. For all $$1 \le i < j \le n$$ we have

$$
\begin{align\*}\
c\_{ij} &= \sum\_{k=1}^na\_{ik}b\_{kj} \\
&= \sum\_{k=1}^{j-1}a\_{ik}b\_{kj}+\sum\_{k=j}^na\_{ik}b\_{kj} \\
&= \sum\_{k=1}^{j-1}a\_{ik}\cdot0+\sum\_{k=j}^n0\cdot b\_{kj}  \\
&= 0.
\end{align\*}
$$

Therefore, the matrix $$C$$ is also lower-triangular.

### D.1-4 🌟

{% hint style="success" %}
Proves that if $$P$$ is an $$n \times n$$ permutation matrix and $$A$$ is an $$n \times n$$ matrix, then the matrix product $$PA$$ is $$A$$ with its rows permuted, and the matrix product $$AP$$ is $$A$$ with its columns permuted. Proves that the product of two permutation matrices is a permutation matrix.
{% endhint %}

As explained in the book, multiplying a vector $$x$$ by a permutation matrix has the effect of permuting (rearranging) the elements of $$x$$. We can represent a matrix as a sequence of column vectors $$A=\braket{c\_1,c\_2,\dots,c\_n}$$. Thus, $$PA=\braket{Pc\_1,Pc\_2,\dots,Pc\_n}$$ is a matrix where each column vector of $$A$$ is permuted in the same way. In other words, rows of $$A$$ are permuted. Analogously, we can view a matrix as a sequence of row vectors $$A=\braket{r\_1^T,r\_2^T,\dots,r\_n^T}$$, hence $$AP=\braket{r\_1^TP,r\_2^TP,\dots,r\_n^TP}$$ is a matrix where each row vector of $$A$$ is permuted in the same way. In other words, columns of $$A$$ are permuted.

The previous explanation proves the statement from the book at a high level. Let us see where elements of $$A$$ land in $$PA$$. Row $$i$$ of $$P$$ contains only a single 1 at index $$q$$, so $$p\_{iq}=1$$ and 0s elsewhere. We have

$$
\begin{align\*}\
a\_{ij} &= \sum\_{k=1}^np\_{ik}a\_{kj} \\\
&= p\_{iq}a\_{qj} \\\
&= a\_{qj}.
\end{align\*}
$$

Notice that all elements from row $$q$$ of $$A$$ goes into row $$i$$. In other words, rows of $$A$$ are permuted. This works since each distinct row of $$P$$ has a unique column index such that $$P$$ has 1 there. A similar observation applies to $$AP$$ just that instead of permuting rows we permute columns.

Apparently, if we replace $$A$$ with $$P$$, then $$PP$$ is another permutation matrix, as permuting columns or rows of $$P$$ still maintains the property that it will have exactly one 1 in each row or column, and 0s elsewhere.

### D.2-1

Suppose, for contradiction, that $$B \neq C$$. We have

$$
B=BI=B(AC)=(BA)C=IC=C.
$$

which contradicts our initial assumption of distinct inverses. Therefore, matrix inverses are unique.

### D.2-2 🌟

{% hint style="success" %}
Proves that the determinant of a lower-triangular or upper-triangular matrix is equal to the product of its diagonal elements. Proves that the inverse of a lower-triangular matrix, if it exists, is lower-triangular.
{% endhint %}

Let $$A$$ be an $$n \times n$$ lower-triangular matrix. We prove by induction on $$n$$ that $$\det (A)$$ is equal to the product of its diagonal elements. The base case holds; when $$n=1$$ then $$\det(A)=a\_{11}$$. For $$n >1$$ we have

$$
\begin{align\*}
\det(A) &= \sum\_{j=1}^n(-1)^{1+j}a\_{1j}\det(A\_{\[1j]}) \\
&= (-1)^{1+1}a\_{11}\det(A\_{\[11]}) \\
&= a\_{11}\prod\_{i=2}^na\_{ii} && \text{(by the inductive hypothesis)} \\
&= \prod\_{i=1}^na\_{ii}.
\end{align\*}
$$

According to the principles of mathematical induction, we can conclude that the statement holds for all $$n \ge 1$$. For an upper-triangular matrix the proof is nearly identical with a difference that we start at $$a\_{nn}$$ and work backwards.

Suppose, for contradiction, that $$B=A^{-1}$$ is not lower-triangular, thus $$i$$th row of $$B$$  has additional nonzero entries $$b\_{i(i+1)},b\_{i(i+2)},\dots,b\_{ij}$$ for  $$i < j \le n$$ (starting at column $$j=n$$ and iterating backwards). We know that $$BA=I$$ and the $$i$$th row of $$I$$ has only one nonzero element at column $$i$$. Since $$A$$ is lower-triangular, the first nonzero element in the $$j$$th column is at row $$j$$, hence $$e\_{ij}=b\_{ij}a\_{jj}=0 \implies b\_{ij}=0$$. But this entails a cascading effect $$b\_{i(j-1)}a\_{(j-1)(j-1)}+0 \cdot a\_{j(j-1)}=0 \implies b\_{i(j-1)}=0$$ and so on. It turns out that $$b\_{ij}=0$$ for all $$j>i$$, which contradicts our statement that $$B$$ is not lower-triangular. Observe that all diagonal elements of $$A$$ must be nonzero, otherwise $$A$$ would be singular.

### D.2-3

Wikipedia has a proof [here](https://en.wikipedia.org/wiki/Permutation_matrix#The_transpose_is_also_the_inverse) that $$P$$ is invertible and $$P^{-1}=P^T$$.&#x20;

$$P^T$$ maintains the property that it has exactly one 1 in each row or column, and 0s elsewhere, since transposing a matrix swaps its columns and rows. Therefore, $$P^T$$ is also a permutation matrix.

### D.2-4

We know that $$a\_i^T \cdot b\_j=0$$, where $$i \neq j$$, and $$a\_i^T \cdot b\_i=1$$; $$a\_*^T$$ is a row vector of $$A$$ and $$b\_*$$ is a column vector of $$B$$. All row vectors of $$A$$, except row $$i$$, are the same in $$A'$$. All column vectors of $$B$$, except column $$j$$, are the same in $$B'$$. It is stated that $$a\_i'^T=a\_i^T+a\_j^T$$ and $$b\_j'=b\_j-b\_i$$ where $$i \neq j$$. $$A'B'=I$$ since for all distinct $$i$$, $$j$$ and $$k$$ we still have

$$
\begin{align\*}
a\_i'^T \cdot b\_j' &= a\_i ^T\cdot b\_j - a\_i^T \cdot b\_i + a\_j ^T\cdot b\_j - a\_j^T \cdot b\_i=0-1+1-0=0 \\
a\_k^T \cdot b\_j' &= a\_k^T \cdot b\_j - a\_k^T \cdot b\_i=0-0=0 \\
a\_i'^T \cdot b\_k &= a\_i^T \cdot b\_k + a\_j^T \cdot b\_k=0+0=0.
\end{align\*}
$$

On the other hand, $$a\_i'^T\cdot b\_i=a\_j^T \cdot b\_j'=a\_k^T \cdot b\_k=1$$, hence $$A'B'$$ has ones on the main diagonal and 0s elsewhere.

### D.2-5

We only prove one direction, as the opposite is symmetrical due to $$(A^{-1})^{-1}=A$$. Assume every entry of $$A$$ is real. Let its inverse $$B=A^{-1}$$ be represented as a sequence of complex column vectors $$b\_j=x\_j+iy\_j$$ for $$j=1,2,\dots,n$$. Here, real $$n$$-vectors $$x\_j$$ and $$y\_j$$ denote the real and imaginary part of $$b\_j$$, respectively. We know that $$AB=I$$, hence $$Ab\_j=Ax\_j+iAy\_j=e\_j$$ is a real $$n$$-vector with single 1 at index $$j$$ and 0s elsewhere. Consequently, $$Ay\_j=0 \implies y\_j=0$$ based on Theorems D.1 and D.2. The previous reasoning applies for all $$j=1,2,\dots,n$$, thus $$B$$ is also a real matrix.

### D.2-6

$$
\begin{align\*}
(A^{-1})^T&= (A^T)^{-1} = A^{-1}. \\
\hline \\
(BAB^T)^T &= ((BA)B^T)^T  \\
&= (B^T)^T(BA)^T \\
&= BA^TB^T \\
&= BAB^T.
\end{align\*}
$$

### D.2-7

Let $$A=\braket{a\_1,a\_2,\dots,a\_n}$$ be a matrix represented as a sequence of column vectors and let $$x$$ be an $$n$$-vector, thus $$Ax=\sum\_{i=1}^n a\_ix\_i$$. If $$A$$ has a full column rank, then all column vectors are linearly independent. Therefore, $$Ax=0 \implies x=0$$. In the opposite direction, assume, for contradiction, that the column rank of $$A$$ is less than $$n$$ even though $$A$$ does not have a null vector. Consequently, the column vectors are linearly dependent, so there is some $$x \neq 0$$ such that $$Ax=0$$. But this contradicts the initial assumption of $$Ax=0 \implies x=0$$. This concludes the proof of Theorem D.2.

### D.2-8 🌟

{% hint style="success" %}
Proves that for any two compatible matrices $$A$$ and $$B$$, $$\text{rank}(AB) ≤ \min {\text{rank}(A),\text{rank}(B)}$$,\
where equality holds if either $$A$$ or $$B$$ is a nonsingular square matrix.
{% endhint %}

Let $$A\_{m \times p}$$ and $$B\_{p \times n}$$ be matrices whose ranks are $$r\_A$$ and $$r\_B$$, respectively. According to the alternate definition of the rank of a matrix, we have that $$A=C'*{m \times r\_A}D'*{r\_A \times p}$$ and $$B=C''*{p \times r\_B}D''*{r\_B \times n}$$. Let $$(AB)*{m \times n}$$ be a matrix whose rank is $$r$$, so $$AB=C*{m \times r}D\_{r \times n}$$. We can express $$AB=C'D'C''D''$$ in two different ways as follows:

* $$AB=(C'D'C'')*{m \times r\_B}D''*{r\_B \times n} \implies r \le r\_B$$
* $$AB=C\_{m \times r\_A}'(D'C''D'')\_{r\_A \times n} \implies r \le r\_A$$

Therefore, we can conclude that $$r \le \min{r\_A,r\_B}$$.&#x20;

Now, assume that $$A\_{m \times m}$$ is a nonsingular matrix, so $$r\_A=m$$. If $$B\_{m \times m}$$ is also a nonsingular matrix then obviously $$r\_B=m$$ and $$r=m$$, so equality holds. Thus, let $$B\_{m \times n}$$ be an arbitrary compatible matrix with rank $$r\_B$$. Since $$A$$ is nonsingular ($$r\_A = m$$), we know that $$r\_B \le r\_A$$. We have

$$
B = A^{-1}AB = (A^{-1}C)*{m \times r}D*{r \times n} \implies r\_B \le r.
$$

Consequently, $$(r \le r\_B \land r \ge r\_B) \implies r= r\_B=\min{r\_A,r\_B}$$. The mirror case, where only B is nonsingular, is proven analogously.

## Problems

### D-1 Vandermonde matrix

We prove the statement from the book using induction on $$n$$. The base case $$n=1$$ trivially holds, as $$V(x\_0)=\prod\_{0\le j\<k\le0}(x\_k-x\_j)=1$$.&#x20;

For $$n>1$$ we rely on Theorem D.4, while applying the hint from the book, which says that a determinant is unchanged if entries in one row/column are added to those in another row/column. This can be extended to the case where entries are previously scaled by some factor $$\lambda \neq 0$$. Theorem D.4 also states that a determinant is multiplied by such factor if all entries in some row/column are multiplied by it. So, we can simply multiply all entries by $$\lambda$$, add such scaled entries to another row/column, and divide the previously multiplied elements by $$\lambda$$.

We transform the Vandermonde matrix $$V$$, by following the hint, to get

$$
V'=\begin{pmatrix}
1 & 0 & 0 & 0 & \cdots & 0 \\\[2mm]
1 & x\_1-x\_0 & x\_1(x\_1-x\_0) & x\_1^2(x\_1-x\_0) & \cdots & x\_1^{n-2}(x\_1-x\_0) \\\[2mm]
1 & x\_2-x\_0 & x\_2(x\_2-x\_0) & x\_2^2(x\_2-x\_0) & \cdots & x\_2^{n-2}(x\_2-x\_0) \\\[2mm]
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\\[2mm]
1 & x\_{n-1}-x\_0 & x\_{n-1}(x\_{n-1}-x\_0) & x\_{n-1}^2(x\_{n-1}-x\_0) & \cdots & x\_{n-1}^{n-2}(x\_{n-1}-x\_0)
\end{pmatrix}.
$$

Each row is scaled by the factor $$x\_i-x\_0$$ for $$i=1,2,\dots,n-1$$, so we have

$$
\begin{align\*}
\det(V(x\_0,x\_1,\dots,x\_{n-1})) &= \det(V') \\
&= \det(V'*{\[11]}) \\
&= \left(\prod*{i=1}^{n-1}(x\_i-x\_0)\right)\det(V(x\_1,x\_2,\dots,x\_{n-1})) \\
&= \prod\_{i=1}^{n-1}(x\_i-x\_0)!!!!!\prod\_{1\le j\<k\le n-1}!!!!!!(x\_k-x\_j) && \text{(by the inductive hypothesis)}\\
&= !!!!!!!!\prod\_{0\le j\<k\le n-1}!!!!!!(x\_k-x\_j).
\end{align\*}
$$

According to the principles of mathematical induction, we can conclude that the statement from the book holds for all $$n \ge 1$$.

### D-2 Permutations defined by matrix-vector multiplication over GF(2)

#### a.

Let $$A$$ be a matrix represented as a sequence of column vectors $$a\_i$$ for $$i=0,1,\dots,n-1$$. WLOG assume that the first $$r$$ column vectors are linearly independent, otherwise we would explicitly work with a set of $$r$$ indices $$J \subseteq{0,1,\dots,n-1}$$ that denote those linearly independent objects. We prove the upper and lower bounds separately.

For any $$n$$-bit vector $$x$$ we can always find a matching $$n$$-bit vector $$y$$, with 0s on the last $$n-r$$ positions, such that $$Ax=Ay$$. This is attainable, as the first $$r$$ linearly independent column vectors of $$A$$ can represent all possible combinations of the last $$n-r$$ dependent vectors. There are at most $$2^r$$ ways of choosing $$y$$ for all possible inputs $$x$$, hence $$|R(A)| \le 2^r$$.

We claim that any unique combination of the first $$r$$ bits of an $$n$$-bit vector $$x$$ produces a distinct image $$Ax$$. Suppose, for contradiction, that there is another $$n$$-bit vector $$y$$ such that $$Ax=Ay$$. Assume that both $$x$$ and $$y$$ has 0s on the last $$n-r$$ bit positions. Therefore, $$A(x-y)=0$$, so $$x-y \neq 0$$ is a null vector of $$A$$. But this contradicts the assumption that $$A$$ has rank $$r$$, since it turns out that neither the first $$r$$ column vectors are linearly independent (see also [Exercise D.2-7](#d.2-7)). Thus, $$|R(A)| \ge 2^r$$.

$$(|R(A)| \le 2^r \land |R(A)| \ge 2^r) \implies |R(A)| =2^r$$ and $$r\<n \implies |R(A)| < 2^n=|S\_n|$$. Therefore, $$A$$ must have a full rank to define a permutation on $$S\_n$$.

#### b.

Suppose, for contradiction, there is some $$y$$ such that $$|P(A,y)|>2^{n−r}$$. By the pigeonhole principle there would be at least one duplicated segment of $$n-r$$ bits linked to two different segments of $$r$$ bits associated with linearly independent columns of $$A$$. But this would mean that $$|R(A)|<2^r$$, which is a contradiction (see part (a)). Therefore, $$|P(A,y)|\le2^{n−r}$$. On the other hand, $$|P(A,y)|\ge2^{n−r}$$ since no $$y$$ could have preimage of size less then $$2^{n-r}$$ without some other having more, which we have just shown is impossible. We know that $$|S\_n|=2^n$$ and everything must add up. We can conclude that $$|P(A,y)|=2^{n−r}$$.

#### c.

At first glance, this task seems complicated, but it is essentially a corollary of parts (a) and (b). Let $$A$$ be a matrix represented as a sequence of column vectors $$a\_i$$ for $$i=0,1,\dots,n-1$$ as before. An image of $$x \in S$$ is a linear combination of these vectors $$y=Ax=\sum\_{i=0}^{n-1} a\_ix\_i$$. The identifier of the block where $$y$$ belongs spans its last $$n-m$$ bit positions, while the first $$m$$ bits specify the offset. Recall that every block is of size $$2^m$$. This mirrors the structure of each column vector of $$A$$, hence only the last $$n-m$$ rows of $$A$$ determine the block identifier. Inputs $$x$$ from separate blocks differ by a constant factor $$i2^m$$, so their images $$y$$ will be shifted by the same amount. WLOG we may assume that our source block $$S$$ is block 0. Consequently, all input $$n$$-bit vectors $$x$$ have 0s in the last $$n-m$$ positions, thus only the leftmost $$m$$ columns of $$A$$ are relevant. Evidently, all the remaining columns of $$A$$ would be multiplied by 0s during the matrix-vector multiplication $$Ax$$.

A fundamental property of any matrix $$A$$ is that its row rank always equals its column rank. The fact that $$r$$ is the rank of the lower-left $$(n-m) \times m$$ submatrix of $$A$$ implies that there are $$r$$ linearly independent column vectors among the leftmost $$m$$ columns of $$A$$. Every distinct combination of these vectors produces a unique block identifier. As a corollary of part (a) we have that $$|B(S',m)|=2^r$$. Furthermore, as a corollary of part (b), we can also conclude that for each block in $$B(S',m)$$, exactly $$2^{m-r}$$ numbers in $$S$$ map to that block.

#### d.

There are $$(2^n)!$$ permutations of $$S\_n$$. The number of linear permutations is upper bounded by $$|A|\times|c|$$, where $$|A|=2^{n^2}$$ (total count of $$n \times n$$ 0-1 matrices) and $$|c|=2^n$$ (total count of $$n$$-bit vectors). For $$n \ge 3$$ it holds that $$(2^n)!>2^{n(n+1)}$$.

#### e.

We start by answering the hint from the book. For a given permutation, multiplying a matrix with a unit vector $$e\_i$$ results in selecting the $$i$$th column vector for $$i=1,2,\dots,n$$. By part (d) the smallest $$n$$ is 3, so choose $$S\_3$$ with the following partial permutation $$\pi\_{A,c}$$: $$\pi\_{A,c}(0)=0$$, $$\pi\_{A,c}(1)=2$$, $$\pi\_{A,c}(2)=4$$, $$\pi\_{A,c}(4)=1$$. Now all parameters are fixed. Currently, 3 is mapped to 6, so just choose something else. Here is the rest of the permutation that cannot be achieved with a linear permutation model: $$\pi\_{A,c}(3)=7$$, $$\pi\_{A,c}(7)=3$$, $$\pi\_{A,c}(6)=6$$, $$\pi\_{A,c}(5)=5$$.


---

# Agent Instructions
This documentation is published with GitBook. GitBook is the documentation platform designed so that both humans and AI agents can read, navigate, and reason over technical content effectively. Learn more at gitbook.com.

## Querying This Documentation
If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter, and the optional `goal` query parameter:

```
GET https://evarga.gitbook.io/sh-intro-to-algs/part-viii-appendix-mathematical-background/appendix-d.md?ask=<question>&goal=<endgoal>
```

`ask` is the immediate question: it should be specific, self-contained, and written in natural language.
`goal` is optional and describes the broader end goal you are ultimately trying to accomplish on behalf of the user. GitBook uses it to tailor the answer towards what is most useful for that goal.

The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
