Appendix D
Exercise D.1-1
For all $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$ we have

$$c_{ij} = a_{ij} + b_{ij} = a_{ji} + b_{ji} = c_{ji} \qquad\text{and}\qquad d_{ij} = a_{ij} - b_{ij} = a_{ji} - b_{ji} = d_{ji},$$

so $C = A + B$ and $D = A - B$ are also symmetric matrices.
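As a quick numerical sanity check (a NumPy sketch with arbitrary example matrices, not part of the original argument):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = rng.integers(-5, 5, size=(2, 4, 4))
A = M + M.T          # a symmetric matrix
B = N + N.T          # another symmetric matrix

C, D = A + B, A - B
assert np.array_equal(C, C.T) and np.array_equal(D, D.T)   # both are symmetric
```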
Exercise D.1-2
Let $A$ be a matrix of size $p \times q$ and $B$ a matrix of size $q \times r$. We have

$$\bigl((AB)^T\bigr)_{ij} = (AB)_{ji} = \sum_{k=1}^{q} a_{jk} b_{ki} = \sum_{k=1}^{q} \bigl(B^T\bigr)_{ik} \bigl(A^T\bigr)_{kj} = \bigl(B^T A^T\bigr)_{ij},$$

so $(AB)^T = B^T A^T$. Also, $(A^T A)^T = A^T (A^T)^T = A^T A$, so $A^T A$ is always a symmetric matrix.
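The same two identities can be spot-checked with NumPy (the sizes $p$, $q$, $r$ below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, r = 3, 4, 5
A = rng.standard_normal((p, q))
B = rng.standard_normal((q, r))

assert np.allclose((A @ B).T, B.T @ A.T)    # (AB)^T = B^T A^T
assert np.allclose(A.T @ A, (A.T @ A).T)    # A^T A is symmetric
```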
Exercise D.1-3
Let $A$ and $B$ be $n \times n$ lower-triangular matrices and $C = AB$. For all $1 \le i < j \le n$ we have

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj} = \sum_{k=1}^{i} a_{ik} b_{kj} + \sum_{k=i+1}^{n} a_{ik} b_{kj} = 0,$$

because $b_{kj} = 0$ whenever $k \le i < j$ ($B$ is lower-triangular) and $a_{ik} = 0$ whenever $k > i$ ($A$ is lower-triangular). Therefore, the matrix $C$ is also lower-triangular.
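A small sketch confirming the claim on random lower-triangular matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = np.tril(rng.standard_normal((n, n)))    # lower-triangular
B = np.tril(rng.standard_normal((n, n)))    # lower-triangular
C = A @ B
assert np.allclose(C, np.tril(C))           # the product is lower-triangular as well
```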
Exercise D.1-4
As explained in the book, multiplying a vector $x$ by a permutation matrix has the effect of permuting (rearranging) the elements of $x$. We can represent a matrix as a sequence of column vectors, $A = \langle c_1, c_2, \ldots, c_n \rangle$. Thus, $PA = \langle Pc_1, Pc_2, \ldots, Pc_n \rangle$ is a matrix in which each column vector of $A$ is permuted in the same way; in other words, the rows of $A$ are permuted. Analogously, we can view a matrix as a sequence of row vectors, $A = \langle r_1^T, r_2^T, \ldots, r_n^T \rangle$, hence $AP = \langle r_1^T P, r_2^T P, \ldots, r_n^T P \rangle$ is a matrix in which each row vector of $A$ is permuted in the same way; in other words, the columns of $A$ are permuted.

The previous explanation proves the statement from the book at a high level. Let us see where the elements of $A$ land in $PA$. Row $i$ of $P$ contains only a single 1, say at column $q$, so $p_{iq} = 1$ and all other entries of that row are 0. We have

$$(PA)_{ij} = \sum_{k=1}^{n} p_{ik} a_{kj} = p_{iq} a_{qj} = a_{qj}.$$

Notice that all elements from row $q$ of $A$ go into row $i$; in other words, the rows of $A$ are permuted. This works because each distinct row of $P$ has its single 1 in a distinct column. A similar observation applies to $AP$, except that instead of permuting rows we permute columns.

Finally, if we replace $A$ with $P$, then $PP$ is another permutation matrix, since permuting the columns or rows of $P$ preserves the property that it has exactly one 1 in each row and each column, and 0s elsewhere.
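The following NumPy sketch illustrates all three observations on an arbitrary example permutation:

```python
import numpy as np

n = 4
perm = [2, 0, 3, 1]                      # an arbitrary permutation of {0, 1, 2, 3}
P = np.eye(n, dtype=int)[perm]           # row i of P has its single 1 in column perm[i]
A = np.arange(n * n).reshape(n, n)

# PA permutes the rows of A; AP permutes the columns of A.
assert np.array_equal(P @ A, A[perm, :])
assert np.array_equal(A @ P, A[:, np.argsort(perm)])

# PP is again a 0-1 matrix with exactly one 1 in each row and each column.
PP = P @ P
assert ((PP == 0) | (PP == 1)).all()
assert (PP.sum(axis=0) == 1).all() and (PP.sum(axis=1) == 1).all()
```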
Exercise D.2-1
Suppose, for contradiction, that $A$ has two distinct inverses $B \ne C$. We have

$$B = B I_n = B (A C) = (B A) C = I_n C = C,$$

which contradicts our initial assumption of distinct inverses. Therefore, matrix inverses are unique.
Exercise D.2-2
Let $A$ be an $n \times n$ lower-triangular matrix. We prove by induction on $n$ that $\det(A)$ is equal to the product of its diagonal elements. The base case holds: when $n = 1$, then $\det(A) = a_{11}$. For $n > 1$ we expand along the first row, whose only (possibly) nonzero entry is $a_{11}$, and apply the inductive hypothesis to the $(n-1) \times (n-1)$ lower-triangular submatrix $A_{[1][1]}$:

$$\det(A) = a_{11} \det\bigl(A_{[1][1]}\bigr) = a_{11} \prod_{i=2}^{n} a_{ii} = \prod_{i=1}^{n} a_{ii}.$$

According to the principle of mathematical induction, we can conclude that the statement holds for all $n \ge 1$. For an upper-triangular matrix the proof is nearly identical, with the difference that we start at $a_{nn}$ and work backwards.

Suppose, for contradiction, that $B = A^{-1}$ is not lower-triangular; then some row $i$ of $B$ has nonzero entries above the diagonal. Let $j$, with $i < j \le n$, be the largest column index such that $b_{ij} \ne 0$, so that $b_{ik} = 0$ for all $j < k \le n$. We know that $BA = I_n$, and row $i$ of $I_n$ has its only nonzero element at column $i$. Since $A$ is lower-triangular, the first nonzero element in the $j$th column is at row $j$, hence

$$0 = (I_n)_{ij} = \sum_{k=j}^{n} b_{ik} a_{kj} = b_{ij} a_{jj} \implies b_{ij} = 0,$$

where we use the fact that all diagonal elements of $A$ must be nonzero, since otherwise $A$ would be singular. But this entails a cascading effect: $0 = (I_n)_{i(j-1)} = b_{i(j-1)} a_{(j-1)(j-1)} + 0 \cdot a_{j(j-1)} \implies b_{i(j-1)} = 0$, and so on. It turns out that $b_{ij} = 0$ for all $j > i$, which contradicts our statement that $B$ is not lower-triangular.
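Both parts of the exercise can be spot-checked numerically (a sketch with an arbitrary nonsingular lower-triangular matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
A = np.tril(rng.standard_normal((n, n)))
A[np.diag_indices(n)] = rng.uniform(1.0, 2.0, n)    # nonzero diagonal => nonsingular

assert np.isclose(np.linalg.det(A), np.prod(np.diag(A)))    # det = product of the diagonal
B = np.linalg.inv(A)
assert np.allclose(B, np.tril(B))                           # the inverse is lower-triangular
```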
Exercise D.2-3
Wikipedia has a proof that $P$ is invertible and $P^{-1} = P^T$. Directly, $(P P^T)_{ij} = \sum_{k=1}^{n} p_{ik} p_{jk}$, which is 1 when $i = j$ (the single 1 in row $i$ matches itself) and 0 when $i \ne j$ (distinct rows have their 1s in distinct columns), so $P P^T = I_n$.

$P^T$ maintains the property that it has exactly one 1 in each row and each column, and 0s elsewhere, since transposing a matrix swaps its columns and rows. Therefore, $P^T$ is also a permutation matrix.
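A short numerical check of $P P^T = I_n$ on an arbitrary permutation matrix:

```python
import numpy as np

perm = [3, 1, 4, 0, 2]                       # an arbitrary permutation
P = np.eye(len(perm), dtype=int)[perm]
assert np.array_equal(P @ P.T, np.eye(len(perm), dtype=int))    # hence P^{-1} = P^T
```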
Exercise D.2-4
We know that $a_i^T \cdot b_j = 0$, where $i \ne j$, and $a_i^T \cdot b_i = 1$; here $a_*^T$ is a row vector of $A$ and $b_*$ is a column vector of $B$. All row vectors of $A$, except row $i$, are the same in $A'$. All column vectors of $B$, except column $j$, are the same in $B'$. It is stated that $a_i'^T = a_i^T + a_j^T$ and $b_j' = b_j - b_i$, where $i \ne j$. We get $A'B' = I_n$ since for all distinct $i$, $j$ and $k$ we still have

$$a_i'^T \cdot b_j' = (a_i^T + a_j^T)(b_j - b_i) = 0 - 1 + 1 - 0 = 0, \qquad a_i'^T \cdot b_k = a_i^T b_k + a_j^T b_k = 0, \qquad a_k^T \cdot b_j' = a_k^T b_j - a_k^T b_i = 0.$$

On the other hand, $a_i'^T \cdot b_i = a_j^T \cdot b_j' = a_k^T \cdot b_k = 1$, hence $A'B'$ has 1s on the main diagonal and 0s elsewhere.
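A NumPy sketch of the update (the indices $i$ and $j$ below are arbitrary distinct choices):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n))
B = np.linalg.inv(A)                   # AB = I

i, j = 1, 3                            # arbitrary, i != j
A2 = A.copy(); A2[i, :] += A[j, :]     # A': add row j of A to row i
B2 = B.copy(); B2[:, j] -= B[:, i]     # B': subtract column i of B from column j

assert np.allclose(A2 @ B2, np.eye(n))
```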
Exercise D.2-5
We only prove one direction, as the opposite one is symmetric due to $(A^{-1})^{-1} = A$. Assume every entry of $A$ is real. Let its inverse $B = A^{-1}$ be represented as a sequence of complex column vectors $b_j = x_j + i y_j$ for $j = 1, 2, \ldots, n$. Here, the real $n$-vectors $x_j$ and $y_j$ denote the real and imaginary parts of $b_j$, respectively. We know that $AB = I_n$, hence $A b_j = A x_j + i A y_j = e_j$ is a real $n$-vector with a single 1 at index $j$ and 0s elsewhere. Consequently, $A y_j = 0 \implies y_j = 0$ based on Theorem D.1 and Theorem D.2. The previous reasoning applies for all $j = 1, 2, \ldots, n$, thus $B$ is also a real matrix.
Exercise D.2-6
Exercise D.2-7
Let $A = \langle a_1, a_2, \ldots, a_n \rangle$ be a matrix represented as a sequence of column vectors and let $x$ be an $n$-vector, thus $Ax = \sum_{i=1}^{n} a_i x_i$. If $A$ has full column rank, then all of its column vectors are linearly independent; therefore $Ax = 0 \implies x = 0$, i.e., $A$ has no null vector. In the opposite direction, assume, for contradiction, that the column rank of $A$ is less than $n$ even though $A$ does not have a null vector. Consequently, the column vectors are linearly dependent, so there is some $x \ne 0$ such that $Ax = 0$. But this contradicts the initial assumption that $Ax = 0 \implies x = 0$. This concludes the proof of Theorem D.2.
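A small numerical illustration (the matrices are arbitrary examples): forcing a linear dependency among the columns drops the column rank below $n$ and produces a nonzero null vector.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 6, 4
A_full = rng.standard_normal((m, n))            # full column rank (with probability 1)
A_def = A_full.copy()
A_def[:, 3] = A_def[:, 0] + A_def[:, 1]         # column 3 now depends on columns 0 and 1

x = np.array([1.0, 1.0, 0.0, -1.0])             # a nonzero null vector of A_def
assert np.allclose(A_def @ x, 0)
assert np.linalg.matrix_rank(A_def) == 3 and np.linalg.matrix_rank(A_full) == 4
```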
Exercise D.2-8
Let $A_{m \times p}$ and $B_{p \times n}$ be matrices whose ranks are $r_A$ and $r_B$, respectively. According to the alternate definition of the rank of a matrix, we have $A = C'_{m \times r_A} D'_{r_A \times p}$ and $B = C''_{p \times r_B} D''_{r_B \times n}$. Let $(AB)_{m \times n}$ be a matrix whose rank is $r$, i.e., $r$ is the smallest number for which a factorization $AB = C_{m \times r} D_{r \times n}$ exists. We can express $AB = C' D' C'' D''$ in two different ways as follows:

$$AB = (C' D' C'')_{m \times r_B} \, D''_{r_B \times n} \implies r \le r_B,$$

$$AB = C'_{m \times r_A} \, (D' C'' D'')_{r_A \times n} \implies r \le r_A.$$

Therefore, we can conclude that $r \le \min\{r_A, r_B\}$.

Now, assume that $A_{m \times m}$ is a nonsingular matrix, so $r_A = m$. If $B_{m \times m}$ is also a nonsingular matrix, then obviously $r_B = m$ and $r = m$, so equality holds. Thus, let $B_{m \times n}$ be a matrix whose rank is $r_B \le \min\{m, n\} \le r_A$. We have

$$B = A^{-1} (AB) \implies r_B \le r,$$

by the first part of the proof applied to $A^{-1}$ and $AB$. Consequently, $(r \le r_B \wedge r \ge r_B) \implies r = r_B = \min\{r_A, r_B\}$. The mirror case, where only $B$ is nonsingular, is proven analogously.
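A quick rank check with NumPy (random matrices of the stated shapes; the ranks below hold with probability 1):

```python
import numpy as np

rng = np.random.default_rng(6)
m, p, n = 5, 4, 6
A = rng.standard_normal((m, 2)) @ rng.standard_normal((2, p))    # rank 2
B = rng.standard_normal((p, 3)) @ rng.standard_normal((3, n))    # rank 3
rA, rB, r = (np.linalg.matrix_rank(M) for M in (A, B, A @ B))
assert r <= min(rA, rB)

A_nonsingular = rng.standard_normal((p, p))               # nonsingular with probability 1
assert np.linalg.matrix_rank(A_nonsingular @ B) == rB     # equality when A is nonsingular
```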
Problem D-1
We prove the statement from the book using induction on $n$. The base case $n = 1$ trivially holds, as $\det V(x_0) = \prod_{0 \le j < k \le 0} (x_k - x_j) = 1$ (an empty product).

For $n > 1$ we rely on Theorem D.4, while applying the hint from the book, which says that a determinant is unchanged if entries in one row/column are added to those in another row/column. This can be extended to the case where the added entries are first scaled by some factor $\lambda \ne 0$: Theorem D.4 also states that the determinant is multiplied by $\lambda$ if all entries in some row/column are multiplied by it. So, we can simply multiply all entries of a row/column by $\lambda$, add the scaled entries to another row/column, and then divide the previously multiplied row/column by $\lambda$ again; the determinant is unchanged overall.

We transform the Vandermonde matrix $V$, by following the hint (subtracting $x_0$ times column $i$ from column $i + 1$, for $i = n-1, n-2, \ldots, 1$), to get

$$\det V(x_0, x_1, \ldots, x_{n-1}) =
\det \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
1 & x_1 - x_0 & x_1 (x_1 - x_0) & \cdots & x_1^{n-2} (x_1 - x_0) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{n-1} - x_0 & x_{n-1} (x_{n-1} - x_0) & \cdots & x_{n-1}^{n-2} (x_{n-1} - x_0)
\end{pmatrix}.$$

Expanding along the first row leaves the $(n-1) \times (n-1)$ minor in the lower right. Each of its rows is scaled by the factor $x_i - x_0$ for $i = 1, 2, \ldots, n-1$, so we have

$$\det V(x_0, x_1, \ldots, x_{n-1}) = \prod_{i=1}^{n-1} (x_i - x_0) \cdot \det V(x_1, \ldots, x_{n-1}) = \prod_{i=1}^{n-1} (x_i - x_0) \prod_{1 \le j < k \le n-1} (x_k - x_j) = \prod_{0 \le j < k \le n-1} (x_k - x_j),$$

where the second equality uses the inductive hypothesis. According to the principle of mathematical induction, we can conclude that the statement from the book holds for all $n \ge 1$.
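As a numerical check of the formula, the sketch below compares NumPy's determinant of a small Vandermonde matrix with the product $\prod_{j<k}(x_k - x_j)$ (the sample points are arbitrary):

```python
import numpy as np
from itertools import combinations
from math import prod

x = np.array([2.0, -1.0, 3.0, 0.5, 4.0])       # arbitrary distinct points
V = np.vander(x, increasing=True)               # row i is (1, x_i, x_i^2, ..., x_i^{n-1})
formula = prod(x[k] - x[j] for j, k in combinations(range(len(x)), 2))
assert np.isclose(np.linalg.det(V), formula)
```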
Problem D-2
a.
Let $A$ be a matrix represented as a sequence of column vectors $a_i$ for $i = 0, 1, \ldots, n-1$. WLOG assume that the first $r$ column vectors are linearly independent; otherwise we would explicitly work with a set of $r$ indices $J \subseteq \{0, 1, \ldots, n-1\}$ that denotes those linearly independent columns. We prove the upper and lower bounds separately.

For any $n$-bit vector $x$ we can always find a matching $n$-bit vector $y$, with 0s in the last $n - r$ positions, such that $Ax = Ay$. This is attainable, as the first $r$ linearly independent column vectors of $A$ can represent every combination of the last $n - r$ dependent vectors. There are at most $2^r$ ways of choosing $y$ over all possible inputs $x$, hence $|R(A)| \le 2^r$.

We claim that any distinct combination of the first $r$ bits of an $n$-bit vector $x$ produces a distinct image $Ax$. Suppose, for contradiction, that there is another $n$-bit vector $y \ne x$ such that $Ax = Ay$, where both $x$ and $y$ have 0s in the last $n - r$ bit positions. Therefore, $A(x - y) = 0$, so $x - y \ne 0$ is a null vector of $A$ whose nonzero entries lie within the first $r$ positions. But this contradicts the assumption that the first $r$ column vectors are linearly independent (see also Exercise D.2-7). Thus, $|R(A)| \ge 2^r$.

$(|R(A)| \le 2^r \wedge |R(A)| \ge 2^r) \implies |R(A)| = 2^r$, and $r < n \implies |R(A)| < 2^n = |S_n|$. Therefore, $A$ must have full rank to define a permutation on $S_n$.
b.
Suppose, for contradiction, that there is some $y$ such that $|P(A, y)| > 2^{n-r}$. By the pigeonhole principle, two elements of $P(A, y)$ would agree on the last $n - r$ bit positions while differing somewhere in the first $r$ positions, i.e., two different settings of the $r$ bits associated with the linearly independent columns of $A$ would produce the same image. But this would mean that $|R(A)| < 2^r$, which is a contradiction (see part (a)). Therefore, $|P(A, y)| \le 2^{n-r}$. On the other hand, $|P(A, y)| \ge 2^{n-r}$, since no $y$ could have a preimage of size less than $2^{n-r}$ without some other $y$ having a larger one, which we have just shown is impossible. We know that $|S_n| = 2^n$, $|R(A)| = 2^r$, and everything must add up. We can conclude that $|P(A, y)| = 2^{n-r}$.
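Parts (a) and (b) can be verified by brute force for a small $n$. The sketch below stores the rows of $A$ as integer bitmasks; `gf2_rank` and `apply` are small helpers written for this check, not notation from the book:

```python
from collections import Counter
import random

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    pivots = {}                                 # highest set bit -> reduced row
    for row in rows:
        while row:
            h = row.bit_length() - 1
            if h not in pivots:
                pivots[h] = row
                break
            row ^= pivots[h]
    return len(pivots)

def apply(A, x):
    """y = Ax over GF(2); bit i of y is the parity of (row i of A) AND x."""
    return sum((bin(row & x).count("1") % 2) << i for i, row in enumerate(A))

random.seed(7)
n = 6
A = [random.randrange(1 << n) for _ in range(n)]     # a random n x n 0-1 matrix
r = gf2_rank(A)

counts = Counter(apply(A, x) for x in range(1 << n))
assert len(counts) == 2 ** r                              # part (a): |R(A)| = 2^r
assert all(c == 2 ** (n - r) for c in counts.values())    # part (b): |P(A, y)| = 2^{n-r}
```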
c.
At first glance, this task seems complicated, but it is essentially a corollary of parts (a) and (b). Let $A$ be a matrix represented as a sequence of column vectors $a_i$ for $i = 0, 1, \ldots, n-1$, as before. The image of $x \in S$ is a linear combination of these vectors, $y = Ax = \sum_{i=0}^{n-1} a_i x_i$. The identifier of the block where $y$ belongs spans its last $n - m$ bit positions, while the first $m$ bits specify the offset within the block; recall that every block is of size $2^m$. This mirrors the structure of each column vector of $A$, hence only the last $n - m$ rows of $A$ determine the block identifier. Inputs $x$ from different blocks differ from the corresponding inputs of block 0 by the same constant $i \cdot 2^m$, so their images $y$ will all be shifted by the same amount. WLOG we may therefore assume that our source block $S$ is block 0. Consequently, all input $n$-bit vectors $x$ have 0s in the last $n - m$ positions, thus only the leftmost $m$ columns of $A$ are relevant; evidently, all the remaining columns of $A$ are multiplied by 0s during the matrix-vector multiplication $Ax$.

A fundamental property of any matrix is that its row rank always equals its column rank. The fact that $r$ is the rank of the lower-left $(n - m) \times m$ submatrix of $A$ implies that this submatrix has $r$ linearly independent columns, and only this submatrix matters: its rows are the last $n - m$ rows of $A$ (which determine the block identifier), restricted to the leftmost $m$ columns (the only relevant ones). Every distinct combination of these $r$ independent columns produces a distinct block identifier. As a corollary of part (a) we have that $|B(S', m)| = 2^r$. Furthermore, as a corollary of part (b), we can also conclude that for each block in $B(S', m)$, exactly $2^{m-r}$ numbers in $S$ map to that block.
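The same brute-force idea checks part (c): take block 0 as the source block, compute the block identifier of every image, and compare the counts against the rank $r$ of the lower-left $(n-m) \times m$ submatrix (the helpers are the same ones as in the previous sketch):

```python
from collections import Counter
import random

def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are given as integer bitmasks."""
    pivots = {}
    for row in rows:
        while row:
            h = row.bit_length() - 1
            if h not in pivots:
                pivots[h] = row
                break
            row ^= pivots[h]
    return len(pivots)

def apply(A, x):
    """y = Ax over GF(2); bit i of y is the parity of (row i of A) AND x."""
    return sum((bin(row & x).count("1") % 2) << i for i, row in enumerate(A))

random.seed(8)
n, m = 6, 3
A = [random.randrange(1 << n) for _ in range(n)]

# Source block S = block 0: inputs whose last n-m bits are 0, i.e. x in [0, 2^m).
block_ids = Counter(apply(A, x) >> m for x in range(1 << m))

# r = rank of the lower-left (n-m) x m submatrix: rows m..n-1, columns 0..m-1.
r = gf2_rank([row & ((1 << m) - 1) for row in A[m:]])

assert len(block_ids) == 2 ** r                                # |B(S', m)| = 2^r
assert all(c == 2 ** (m - r) for c in block_ids.values())      # 2^{m-r} numbers per block
```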
d.
There are $(2^n)!$ permutations of $S_n$. The number of linear permutations is upper-bounded by $|A| \times |c|$, where $|A| = 2^{n^2}$ (the total count of $n \times n$ 0-1 matrices) and $|c| = 2^n$ (the total count of $n$-bit vectors). For $n \ge 3$ it holds that $(2^n)! > 2^{n(n+1)}$, so not every permutation of $S_n$ is linear.
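A quick comparison of the two counts for small $n$:

```python
from math import factorial

for n in range(2, 6):
    linear_upper_bound = 2 ** (n * n) * 2 ** n    # |A| * |c| = 2^{n^2} * 2^n
    total = factorial(2 ** n)                     # number of permutations of S_n
    print(n, linear_upper_bound, total, total > linear_upper_bound)
# n = 2: 64 vs 24; from n = 3 onwards, (2^n)! exceeds 2^{n(n+1)}.
```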
e.
We start by answering the hint from the book. For a given linear permutation, multiplying the matrix $A$ by a unit vector $e_i$ selects the $i$th column vector of $A$, for $i = 1, 2, \ldots, n$. By part (d) the smallest candidate $n$ is 3, so choose $S_3$ with the following partial permutation $\pi_{A,c}$: $\pi_{A,c}(0) = 0$ (which forces $c = 0$), and $\pi_{A,c}(1) = 2$, $\pi_{A,c}(2) = 4$, $\pi_{A,c}(4) = 1$ (which fix the three columns of $A$). Now all parameters are fixed. In particular, since $3 = 1 \oplus 2$ and $c = 0$, 3 is currently mapped to $\pi_{A,c}(1) \oplus \pi_{A,c}(2) = 2 \oplus 4 = 6$, so just choose something else for it. Here is the rest of the permutation, which therefore cannot be achieved by any linear permutation: $\pi_{A,c}(3) = 7$, $\pi_{A,c}(7) = 3$, $\pi_{A,c}(6) = 6$, $\pi_{A,c}(5) = 5$.
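The claim can also be confirmed by exhausting all $2^{n^2}$ matrices $A$ and all $2^n$ vectors $c$ for $n = 3$: the sketch below finds no pair realizing the permutation above (the helper `apply` is mine, as before):

```python
from itertools import product

def apply(A, x):
    """y = Ax over GF(2); the rows of A are 3-bit masks."""
    return sum((bin(row & x).count("1") % 2) << i for i, row in enumerate(A))

# The completed permutation from the text: 0->0, 1->2, 2->4, 3->7, 4->1, 5->5, 6->6, 7->3.
target = [0, 2, 4, 7, 1, 5, 6, 3]

found = any(
    all(apply(A, x) ^ c == target[x] for x in range(8))
    for A in product(range(8), repeat=3)     # all 2^9 row choices of a 3x3 0-1 matrix
    for c in range(8)                        # all 2^3 vectors c
)
print(found)    # False: no linear permutation pi_{A,c} realizes this mapping
```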