Appendix D

Exercise D.1-1

For all $i=1,2,\dots,n$ and $j=1,2,\dots,n$ we have

$$\begin{align*} c_{ij} &= a_{ij}+b_{ij} \\ &= a_{ji}+b_{ji} \\ &= c_{ji} \\\\ d_{ij} &= a_{ij}-b_{ij} \\ &= a_{ji}-b_{ji} \\ &= d_{ji} \end{align*}$$

so $C=A+B$ and $D=A-B$ are also symmetric matrices.
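As a quick sanity check (not part of the proof), the claim can be confirmed on concrete symmetric matrices; the matrices below are arbitrary examples:

```python
# Illustrative check: the sum and difference of two symmetric
# matrices are again symmetric (Exercise D.1-1).

def transpose(M):
    return [list(row) for row in zip(*M)]

A = [[1, 2, 3], [2, 5, 6], [3, 6, 9]]  # symmetric
B = [[0, 7, 1], [7, 4, 2], [1, 2, 8]]  # symmetric

C = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
D = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

assert C == transpose(C) and D == transpose(D)
```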

Exercise D.1-2

Let $A$ be a matrix of size $p \times q$ and $B$ a matrix of size $q \times r$. We have

$$\begin{align*} c_{ji}^T = c_{ij} &= \sum_{k=1}^q a_{ik}b_{kj} && (C^T=(AB)^T) \\ &= \sum_{k=1}^q b_{jk}^T a_{ki}^T && (C^T=B^TA^T) \end{align*}$$

Also, $(A^TA)^T=A^T(A^T)^T=A^TA$, so $A^TA$ is always a symmetric matrix.
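Both identities can be checked numerically on a small example (an illustrative sketch; the matrices are arbitrary):

```python
# Illustrative check of (AB)^T = B^T A^T and of the symmetry of
# A^T A on a concrete 2x3 matrix (Exercise D.1-2).

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 3], [4, 5, 6]]        # 2x3
B = [[7, 8], [9, 10], [11, 12]]   # 3x2

assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))

AtA = matmul(transpose(A), A)
assert AtA == transpose(AtA)      # A^T A is symmetric
```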

Exercise D.1-3

Let $A$ and $B$ be $n \times n$ lower-triangular matrices and $C=AB$. For all $1 \le i < j \le n$ we have

$$\begin{align*} c_{ij} &= \sum_{k=1}^n a_{ik}b_{kj} \\ &= \sum_{k=1}^{j-1}a_{ik}b_{kj}+\sum_{k=j}^n a_{ik}b_{kj} \\ &= \sum_{k=1}^{j-1}a_{ik}\cdot 0+\sum_{k=j}^n 0\cdot b_{kj} \\ &= 0 \end{align*}$$

Therefore, the matrix $C$ is also lower-triangular.
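A short sketch verifies this on two arbitrary lower-triangular matrices:

```python
# Illustrative check: the product of two lower-triangular matrices
# is lower-triangular (Exercise D.1-3).

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 0, 0], [5, 3, 0], [1, 4, 7]]
B = [[1, 0, 0], [6, 2, 0], [3, 8, 4]]
C = matmul(A, B)

n = len(C)
assert all(C[i][j] == 0 for i in range(n) for j in range(n) if j > i)
```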

Exercise D.1-4

As explained in the book, multiplying a vector $x$ by a permutation matrix has the effect of permuting (rearranging) the elements of $x$. We can represent a matrix as a sequence of column vectors $A=\braket{c_1,c_2,\dots,c_n}$. Thus, $PA=\braket{Pc_1,Pc_2,\dots,Pc_n}$ is a matrix where each column vector of $A$ is permuted in the same way. In other words, the rows of $A$ are permuted. Analogously, we can view a matrix as a sequence of row vectors $A=\braket{r_1^T,r_2^T,\dots,r_n^T}$, hence $AP=\braket{r_1^TP,r_2^TP,\dots,r_n^TP}$ is a matrix where each row vector of $A$ is permuted in the same way. In other words, the columns of $A$ are permuted.

The previous explanation proves the statement from the book at a high level. Let us see where the elements of $A$ land in $PA$. Row $i$ of $P$ contains only a single 1, at some column $q$, so $p_{iq}=1$ and $p_{ik}=0$ for $k \neq q$. We have

$$\begin{align*} (PA)_{ij} &= \sum_{k=1}^n p_{ik}a_{kj} \\ &= p_{iq}a_{qj} \\ &= a_{qj} \end{align*}$$

Notice that all elements from row $q$ of $A$ go into row $i$. In other words, the rows of $A$ are permuted. This works since each distinct row of $P$ has a unique column index where $P$ holds a 1. A similar observation applies to $AP$, except that instead of permuting rows we permute columns.

In particular, if we replace $A$ with $P$, then $PP$ is another permutation matrix, as permuting the columns or rows of $P$ preserves the property that it has exactly one 1 in each row and column, and 0s elsewhere.
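The row/column behavior and the closure under products can be checked on a concrete $3 \times 3$ permutation matrix (an illustrative sketch):

```python
# Illustrative check: PA permutes the rows of A, AP permutes its
# columns, and PP is again a permutation matrix (Exercise D.1-4).

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

P = [[0, 1, 0],   # row 0 has its 1 in column 1
     [0, 0, 1],   # row 1 has its 1 in column 2
     [1, 0, 0]]   # row 2 has its 1 in column 0
A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

assert matmul(P, A) == [A[1], A[2], A[0]]                  # rows permuted
assert matmul(A, P) == [[3, 1, 2], [6, 4, 5], [9, 7, 8]]   # columns permuted

PP = matmul(P, P)
assert all(sum(row) == 1 for row in PP)                    # one 1 per row
assert all(sum(col) == 1 for col in zip(*PP))              # one 1 per column
```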

Exercise D.2-1

Suppose, for contradiction, that $B \neq C$, where both $B$ and $C$ are inverses of $A$. We have

$$B=BI=B(AC)=(BA)C=IC=C$$

which contradicts our initial assumption of distinct inverses. Therefore, matrix inverses are unique.

Exercise D.2-2

Let $A$ be an $n \times n$ lower-triangular matrix. We prove by induction on $n$ that $\det(A)$ is equal to the product of its diagonal elements. The base case holds; when $n=1$ then $\det(A)=a_{11}$. For $n>1$ we have

$$\begin{align*} \det(A) &= \sum_{j=1}^n(-1)^{1+j}a_{1j}\det(A_{[1j]}) \\ &= (-1)^{1+1}a_{11}\det(A_{[11]}) \\ &= a_{11}\prod_{i=2}^n a_{ii} && \text{(by the inductive hypothesis)} \\ &= \prod_{i=1}^n a_{ii} \end{align*}$$

According to the principles of mathematical induction, we can conclude that the statement holds for all $n \ge 1$. For an upper-triangular matrix the proof is nearly identical, with the difference that we start at $a_{nn}$ and work backwards.
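The claim can be verified numerically with a direct cofactor expansion (an illustrative sketch on arbitrary triangular matrices):

```python
# Illustrative check: the determinant of a triangular matrix equals
# the product of its diagonal entries, computed via cofactor
# expansion along the first row (Exercise D.2-2).

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

L = [[2, 0, 0], [5, 3, 0], [1, 4, 7]]   # lower-triangular
U = [[2, 5, 1], [0, 3, 4], [0, 0, 7]]   # upper-triangular
assert det(L) == det(U) == 2 * 3 * 7
```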

Suppose, for contradiction, that $B=A^{-1}$ is not lower-triangular, so row $i$ of $B$ has additional nonzero entries $b_{i(i+1)},b_{i(i+2)},\dots,b_{ij}$ for some $i < j \le n$. We know that $BA=I$ and row $i$ of $I$ has only one nonzero element, at column $i$. Since $A$ is lower-triangular, the first nonzero element in column $j$ is at row $j$, hence $e_{ij}=b_{ij}a_{jj}=0 \implies b_{ij}=0$; here $e_{ij}=0$ because $i \neq j$, and $b_{ik}=0$ for all $j < k \le n$. But this entails a cascading effect: $b_{i(j-1)}a_{(j-1)(j-1)}+0 \cdot a_{j(j-1)}=0 \implies b_{i(j-1)}=0$, and so on. It turns out that $b_{ij}=0$ for all $j>i$, which contradicts our assumption that $B$ is not lower-triangular. Observe that all diagonal elements of $A$ must be nonzero, otherwise $A$ would be singular.
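One can also see this constructively: solving $LX=I$ column by column with forward substitution never produces a nonzero entry above the diagonal. A minimal sketch, using exact rational arithmetic to avoid rounding:

```python
# Illustrative check: the inverse of a lower-triangular matrix,
# computed by forward substitution on LX = I, is itself
# lower-triangular (Exercise D.2-2).
from fractions import Fraction

def lower_inverse(L):
    n = len(L)
    X = [[Fraction(0)] * n for _ in range(n)]
    for j in range(n):                 # solve column j of LX = I
        for i in range(n):
            s = Fraction(int(i == j))
            for k in range(i):
                s -= L[i][k] * X[k][j]
            X[i][j] = s / L[i][i]      # requires nonzero diagonal
    return X

L = [[2, 0, 0], [5, 3, 0], [1, 4, 7]]
B = lower_inverse(L)
n = len(L)
assert all(B[i][j] == 0 for i in range(n) for j in range(n) if j > i)
```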

Exercise D.2-3

Wikipedia has a proof that $P$ is invertible and $P^{-1}=P^T$.

$P^T$ maintains the property that it has exactly one 1 in each row and column, and 0s elsewhere, since transposing a matrix swaps its rows and columns. Therefore, $P^T$ is also a permutation matrix.
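The identity $PP^T = P^TP = I$ is easy to confirm on a small example (an illustrative sketch):

```python
# Illustrative check: for a permutation matrix P, P P^T = P^T P = I
# (Exercise D.2-3).

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

P = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
Pt = [list(row) for row in zip(*P)]
I = [[int(i == j) for j in range(3)] for i in range(3)]

assert matmul(P, Pt) == I and matmul(Pt, P) == I
```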

Exercise D.2-4

We know that $a_i^T \cdot b_j=0$, where $i \neq j$, and $a_i^T \cdot b_i=1$; here $a_*^T$ is a row vector of $A$ and $b_*$ is a column vector of $B$. All row vectors of $A$, except row $i$, are the same in $A'$. All column vectors of $B$, except column $j$, are the same in $B'$. It is stated that $a_i'^T=a_i^T+a_j^T$ and $b_j'=b_j-b_i$, where $i \neq j$. $A'B'=I$ since for all distinct $i$, $j$ and $k$ we still have

$$\begin{align*} a_i'^T \cdot b_j' &= a_i^T\cdot b_j - a_i^T \cdot b_i + a_j^T\cdot b_j - a_j^T \cdot b_i = 0-1+1-0=0 \\ a_k^T \cdot b_j' &= a_k^T \cdot b_j - a_k^T \cdot b_i = 0-0=0 \\ a_i'^T \cdot b_k &= a_i^T \cdot b_k + a_j^T \cdot b_k = 0+0=0 \end{align*}$$

On the other hand, $a_i'^T\cdot b_i=a_j^T \cdot b_j'=a_k^T \cdot b_k=1$, hence $A'B'$ has 1s on the main diagonal and 0s elsewhere.
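The update can be exercised on a concrete pair; the matrices below are an arbitrary example with $B=A^{-1}$, and $i=0$, $j=1$:

```python
# Illustrative check of Exercise D.2-4: add row j of A to row i, and
# subtract column i of B from column j; the product stays the identity.

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
B = [[1, 0, 0], [-1, 1, 0], [0, 0, 1]]   # B = A^{-1}
I = [[int(i == j) for j in range(3)] for i in range(3)]
assert matmul(A, B) == I

i, j = 0, 1
A2 = [row[:] for row in A]
A2[i] = [a + b for a, b in zip(A[i], A[j])]   # a_i' = a_i + a_j
B2 = [row[:] for row in B]
for k in range(3):
    B2[k][j] = B[k][j] - B[k][i]              # b_j' = b_j - b_i

assert matmul(A2, B2) == I
```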

Exercise D.2-5

We only prove one direction, as the opposite is symmetrical due to $(A^{-1})^{-1}=A$. Assume every entry of $A$ is real. Let its inverse $B=A^{-1}$ be represented as a sequence of complex column vectors $b_j=x_j+iy_j$ for $j=1,2,\dots,n$. Here, real $n$-vectors $x_j$ and $y_j$ denote the real and imaginary parts of $b_j$, respectively. We know that $AB=I$, hence $Ab_j=Ax_j+iAy_j=e_j$ is a real $n$-vector with a single 1 at index $j$ and 0s elsewhere. Consequently, $Ay_j=0 \implies y_j=0$ based on Theorem D.1 and Theorem D.2. The previous reasoning applies for all $j=1,2,\dots,n$, thus $B$ is also a real matrix.

Exercise D.2-6

$$\begin{align*} (A^{-1})^T &= (A^T)^{-1} = A^{-1} \\ \\ (BAB^T)^T &= ((BA)B^T)^T \\ &= (B^T)^T(BA)^T \\ &= BA^TB^T \\ &= BAB^T \end{align*}$$

Exercise D.2-7

Let $A=\braket{a_1,a_2,\dots,a_n}$ be a matrix represented as a sequence of column vectors and let $x$ be an $n$-vector, thus $Ax=\sum_{i=1}^n a_ix_i$. If $A$ has full column rank, then all column vectors are linearly independent. Therefore, $Ax=0 \implies x=0$. In the opposite direction, assume, for contradiction, that the column rank of $A$ is less than $n$ even though $A$ does not have a null vector. Consequently, the column vectors are linearly dependent, so there is some $x \neq 0$ such that $Ax=0$. But this contradicts the initial assumption that $Ax=0 \implies x=0$. This concludes the proof of Theorem D.2.

Exercise D.2-8

Let $A_{m \times p}$ and $B_{p \times n}$ be matrices whose ranks are $r_A$ and $r_B$, respectively. According to the alternate definition of the rank of a matrix, we have that $A=C'_{m \times r_A}D'_{r_A \times p}$ and $B=C''_{p \times r_B}D''_{r_B \times n}$. Let $(AB)_{m \times n}$ be a matrix whose rank is $r$, so $AB=C_{m \times r}D_{r \times n}$. We can express $AB=C'D'C''D''$ in two different ways as follows:

  • $AB=(C'D'C'')_{m \times r_B}D''_{r_B \times n} \implies r \le r_B$

  • $AB=C'_{m \times r_A}(D'C''D'')_{r_A \times n} \implies r \le r_A$

Therefore, we can conclude that $r \le \min\{r_A,r_B\}$.

Now, assume that $A_{m \times m}$ is a nonsingular matrix, so $r_A=m$. If $B_{m \times m}$ is also a nonsingular matrix then obviously $r_B=m$ and $r=m$, so equality holds. Thus, let $B_{m \times n}$ be an arbitrary matrix whose rank is $r_B \le \min\{m,n\} \le r_A$. We have

$$B = A^{-1}AB = (A^{-1}C)_{m \times r}D_{r \times n} \implies r_B \le r$$

Consequently, $(r \le r_B \land r \ge r_B) \implies r = r_B = \min\{r_A,r_B\}$. The mirror case, where only $B$ is nonsingular, is proven analogously.
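Both the inequality and the nonsingular case can be spot-checked numerically; a sketch with ranks computed by Gauss-Jordan elimination over the rationals, on arbitrary example matrices:

```python
# Illustrative check of Exercise D.2-8: rank(AB) <= min(rank(A), rank(B)),
# and rank(AM) = rank(M) when A is nonsingular.
from fractions import Fraction

def rank(M):
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2], [2, 4], [0, 1]]   # 3x2, rank 2
B = [[1, 0, 1], [0, 0, 0]]     # 2x3, rank 1
assert rank(matmul(A, B)) <= min(rank(A), rank(B))

A3 = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]   # nonsingular 3x3
M = [[1, 2, 3], [2, 4, 6], [0, 0, 1]]    # rank 2
assert rank(matmul(A3, M)) == rank(M) == 2
```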

Problem D-1

We prove the statement from the book using induction on $n$. The base case $n=1$ trivially holds, as $V(x_0)=\prod_{0\le j<k\le 0}(x_k-x_j)=1$.

For $n>1$ we rely on Theorem D.4, while applying the hint from the book, which says that a determinant is unchanged if entries in one row/column are added to those in another row/column. This can be extended to the case where the entries are first scaled by some factor $\lambda \neq 0$. Theorem D.4 also states that a determinant is multiplied by such a factor if all entries in some row/column are multiplied by it. So, we can simply multiply all entries by $\lambda$, add the scaled entries to another row/column, and divide the previously multiplied elements by $\lambda$.

We transform the Vandermonde matrix $V$, by following the hint, to get

$$V'=\begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & 0 \\[2mm] 1 & x_1-x_0 & x_1(x_1-x_0) & x_1^2(x_1-x_0) & \cdots & x_1^{n-2}(x_1-x_0) \\[2mm] 1 & x_2-x_0 & x_2(x_2-x_0) & x_2^2(x_2-x_0) & \cdots & x_2^{n-2}(x_2-x_0) \\[2mm] \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\[2mm] 1 & x_{n-1}-x_0 & x_{n-1}(x_{n-1}-x_0) & x_{n-1}^2(x_{n-1}-x_0) & \cdots & x_{n-1}^{n-2}(x_{n-1}-x_0) \end{pmatrix}$$

Each row $i$ of the submatrix $V'_{[11]}$ carries the common factor $x_i-x_0$ for $i=1,2,\dots,n-1$, so we have

$$\begin{align*} \det(V(x_0,x_1,\dots,x_{n-1})) &= \det(V') \\ &= \det(V'_{[11]}) \\ &= \prod_{i=1}^{n-1}(x_i-x_0)\det(V(x_1,x_2,\dots,x_{n-1})) \\ &= \prod_{i=1}^{n-1}(x_i-x_0)\prod_{1\le j<k\le n-1}(x_k-x_j) && \text{(by the inductive hypothesis)} \\ &= \prod_{0\le j<k\le n-1}(x_k-x_j) \end{align*}$$

According to the principles of mathematical induction, we can conclude that the statement from the book holds for all $n \ge 1$.
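The closed form can be checked against a direct determinant computation on arbitrary points (an illustrative sketch):

```python
# Illustrative check of the Vandermonde determinant formula
# det(V) = prod over 0 <= j < k <= n-1 of (x_k - x_j).
from itertools import combinations

def det(M):
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

xs = [2, 5, 7, 11]
n = len(xs)
V = [[x ** k for k in range(n)] for x in xs]

expected = 1
for j, k in combinations(range(n), 2):
    expected *= xs[k] - xs[j]

assert det(V) == expected
```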

Problem D-2

a.

Let $A$ be a matrix represented as a sequence of column vectors $a_i$ for $i=0,1,\dots,n-1$. WLOG assume that the first $r$ column vectors are linearly independent; otherwise we would explicitly work with a set of $r$ indices $J \subseteq \{0,1,\dots,n-1\}$ that denotes the linearly independent columns. We prove the upper and lower bound separately.

For any $n$-bit vector $x$ we can always find a matching $n$-bit vector $y$, with 0s in the last $n-r$ positions, such that $Ax=Ay$. This is attainable, as the first $r$ linearly independent column vectors of $A$ can represent all possible combinations of the last $n-r$ dependent vectors. There are at most $2^r$ ways of choosing $y$ over all possible inputs $x$, hence $|R(A)| \le 2^r$.

We claim that any unique combination of the first $r$ bits of an $n$-bit vector $x$ produces a distinct image $Ax$. Suppose, for contradiction, that there is another $n$-bit vector $y$ such that $Ax=Ay$. Assume that both $x$ and $y$ have 0s in the last $n-r$ bit positions. Therefore, $A(x-y)=0$, so $x-y \neq 0$ is a null vector of $A$. But this contradicts the assumption that $A$ has rank $r$, since then not even the first $r$ column vectors would be linearly independent (see also Exercise D.2-7). Thus, $|R(A)| \ge 2^r$.

$(|R(A)| \le 2^r \land |R(A)| \ge 2^r) \implies |R(A)| = 2^r$, and $r<n \implies |R(A)| < 2^n=|S_n|$. Therefore, $A$ must have full rank to define a permutation on $S_n$.
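The count $|R(A)| = 2^r$ can be verified exhaustively for a small matrix over GF(2); the matrix below is a hypothetical example whose columns are $e_0$, $e_1$, $e_0+e_1$, $e_0$, so its rank is 2:

```python
# Illustrative check of part (a): over GF(2), a rank-2 0-1 matrix on
# 4-bit vectors has an image of size exactly 2^2.
from itertools import product

A = [[1, 0, 1, 1],
     [0, 1, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
n = 4

def apply(A, x):  # y = Ax over GF(2)
    return tuple(sum(A[i][j] * x[j] for j in range(n)) % 2 for i in range(n))

images = {apply(A, x) for x in product([0, 1], repeat=n)}
assert len(images) == 2 ** 2   # 2^rank
```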

b.

Suppose, for contradiction, there is some $y$ such that $|P(A,y)|>2^{n-r}$. By the pigeonhole principle there would be at least one duplicated segment of $n-r$ bits linked to two different segments of $r$ bits associated with linearly independent columns of $A$. But this would mean that $|R(A)|<2^r$, which is a contradiction (see part (a)). Therefore, $|P(A,y)|\le 2^{n-r}$. On the other hand, $|P(A,y)|\ge 2^{n-r}$, since no $y$ could have a preimage of size less than $2^{n-r}$ without some other $y$ having more, which we have just shown is impossible. We know that $|S_n|=2^n$ and everything must add up. We can conclude that $|P(A,y)|=2^{n-r}$.
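The uniform preimage size is likewise easy to confirm exhaustively; a sketch using a hypothetical rank-2 matrix over GF(2) on 4-bit vectors:

```python
# Illustrative check of part (b): every attained image y of a rank-r
# matrix over GF(2) has exactly 2^(n-r) preimages (here n=4, r=2).
from collections import Counter
from itertools import product

A = [[1, 0, 1, 1],
     [0, 1, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
n, r = 4, 2

def apply(A, x):  # y = Ax over GF(2)
    return tuple(sum(A[i][j] * x[j] for j in range(n)) % 2 for i in range(n))

counts = Counter(apply(A, x) for x in product([0, 1], repeat=n))
assert all(c == 2 ** (n - r) for c in counts.values())
```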

c.

At first glance, this task seems complicated, but it is essentially a corollary of parts (a) and (b). Let $A$ be a matrix represented as a sequence of column vectors $a_i$ for $i=0,1,\dots,n-1$, as before. An image of $x \in S$ is a linear combination of these vectors, $y=Ax=\sum_{i=0}^{n-1} a_ix_i$. The identifier of the block where $y$ belongs spans its last $n-m$ bit positions, while the first $m$ bits specify the offset. Recall that every block is of size $2^m$. This mirrors the structure of each column vector of $A$, hence only the last $n-m$ rows of $A$ determine the block identifier. Inputs $x$ from separate blocks differ by an offset of $i2^m$, so their images $y$ will be shifted accordingly. WLOG we may assume that our source block $S'$ is block 0. Consequently, all input $n$-bit vectors $x$ have 0s in the last $n-m$ positions, thus only the leftmost $m$ columns of $A$ are relevant. Evidently, all the remaining columns of $A$ would be multiplied by 0s during the matrix-vector multiplication $Ax$.

A fundamental property of any matrix $A$ is that its row rank always equals its column rank. The fact that $r$ is the rank of the lower-left $(n-m) \times m$ submatrix of $A$ implies that there are $r$ linearly independent column vectors among the leftmost $m$ columns of $A$, restricted to the last $n-m$ rows. Every distinct combination of these vectors produces a unique block identifier. As a corollary of part (a), we have that $|B(S',m)|=2^r$. Furthermore, as a corollary of part (b), we can also conclude that for each block in $B(S',m)$, exactly $2^{m-r}$ numbers in $S'$ map to that block.

d.

There are $(2^n)!$ permutations of $S_n$. The number of linear permutations is upper bounded by $|A|\times|c|$, where $|A|=2^{n^2}$ (the total count of $n \times n$ 0-1 matrices) and $|c|=2^n$ (the total count of $n$-bit vectors). For $n \ge 3$ it holds that $(2^n)!>2^{n(n+1)}$.

e.

We start by answering the hint from the book. For a given permutation, multiplying a matrix with a unit vector $e_i$ selects the $i$th column vector for $i=1,2,\dots,n$. By part (d) the smallest $n$ is 3, so choose $S_3$ with the following partial permutation $\pi_{A,c}$: $\pi_{A,c}(0)=0$, $\pi_{A,c}(1)=2$, $\pi_{A,c}(2)=4$, $\pi_{A,c}(4)=1$. Now all parameters are fixed. Currently, 3 is mapped to 6, so just choose something else. Here is the rest of the permutation, which cannot be achieved with a linear permutation model: $\pi_{A,c}(3)=7$, $\pi_{A,c}(7)=3$, $\pi_{A,c}(6)=6$, $\pi_{A,c}(5)=5$.
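The forcing argument boils down to XOR arithmetic and can be stated in two lines (an illustrative sketch):

```python
# Illustrative check for part (e): with pi(0)=0, pi(1)=2, pi(2)=4,
# pi(4)=1 fixed, pi(0) = c = 0, so any linear model pi(x) = Ax + c
# over GF(2) must send 3 = 1 XOR 2 to pi(1) XOR pi(2) = 6; mapping
# 3 to 7 is therefore impossible.
pi = {0: 0, 1: 2, 2: 4, 4: 1}

assert pi[1] ^ pi[2] == 6
assert pi[1] ^ pi[2] != 7
```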
