• We can add two m × n matrices using mn operations in R.

• We can multiply an m × n matrix and an n × p matrix using O(mnp) operations in R.

It is also easy to see that given an m × m matrix A, and a non-negative integer e, we can adapt the repeated squaring algorithm discussed in §3.4 so as to compute A^e using O(len(e)) multiplications of m × m matrices, and hence O(len(e) m³) operations in R.
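The adaptation is mechanical: run the usual square-and-multiply loop, but with matrix multiplication in place of ring multiplication. A minimal Python sketch, taking R = Z and representing a matrix as a list of rows (the helper names are mine, not the text's):

```python
def mat_mul(A, B):
    # Schoolbook product of an m×n and an n×p matrix: O(mnp) ring operations.
    m, n, p = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)] for i in range(m)]

def mat_pow(A, e):
    # Repeated squaring: O(len(e)) matrix multiplications for a square A.
    m = len(A)
    result = [[int(i == j) for j in range(m)] for i in range(m)]  # m×m identity
    while e > 0:
        if e & 1:
            result = mat_mul(result, A)
        A = mat_mul(A, A)
        e >>= 1
    return result

# e.g. the 10th power of the Fibonacci matrix
assert mat_pow([[1, 1], [1, 0]], 10) == [[89, 55], [55, 34]]
```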

15.2 Matrices and linear maps

Let R be a ring.

For positive integers m and n, we may naturally view R^{1×m} and R^{1×n} as R-modules. If A is an m × n matrix over R, then the map σ that sends v ∈ R^{1×m} to vA ∈ R^{1×n} is easily seen to be an R-linear map. Evidently, σ is injective if and only if the rows of A are linearly independent, and σ is surjective if and only if the rows of A span R^{1×n}. Likewise, the map τ that sends w ∈ R^{n×1} to Aw ∈ R^{m×1} is also an R-linear map. Again, τ is injective if and only if the columns of A are linearly independent, and τ is surjective if and only if the columns of A span R^{m×1}.
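As a concrete illustration, here is a minimal Python sketch of the two maps σ and τ, taking R = Z and representing A as a list of rows (helper names are mine, not the text's):

```python
def row_map(v, A):
    # sigma: sends the row vector v in R^{1×m} to vA in R^{1×n}.
    return [sum(v[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]

def col_map(A, w):
    # tau: sends the column vector w in R^{n×1} to Aw in R^{m×1}.
    return [sum(A[i][j] * w[j] for j in range(len(w))) for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]           # a 2×3 matrix
assert row_map([1, 1], A) == [5, 7, 9]
assert col_map(A, [1, 0, 1]) == [4, 10]
```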

Thus, the matrix A defines in a natural way two different linear maps, one defined in terms of multiplying a row vector on the right by A, and the other in terms of multiplying a column vector on the left by A. With either of these interpretations as a linear map, matrix multiplication has a natural interpretation as function composition. Let A ∈ R^{m×n} and B ∈ R^{n×p}, and consider the product matrix C = AB. Let σ_A, σ_B, σ_C be the maps defined by multiplication on the right by A, B, C, and let τ_A, τ_B, τ_C be the maps defined by multiplication on the left by A, B, C. Then it easily follows from the associativity of matrix multiplication that σ_C = σ_B ∘ σ_A and τ_C = τ_A ∘ τ_B.
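The identity σ_C = σ_B ∘ σ_A can be checked numerically; a small sketch with R = Z, treating a row vector as a 1×m matrix:

```python
def mat_mul(A, B):
    # Schoolbook matrix product over Z.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 0, 2], [0, 1, 1]]    # 2×3
B = [[1, 1], [2, 0], [0, 3]]  # 3×2
C = mat_mul(A, B)             # 2×2
v = [[3, 4]]                  # a row vector, as a 1×2 matrix
# sigma_C = sigma_B ∘ sigma_A: multiplying by C equals multiplying by A, then by B
assert mat_mul(v, C) == mat_mul(mat_mul(v, A), B)
```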

We have seen how matrix/vector multiplication defines a linear map. Conversely, we shall now see that the action of any R-linear map can be viewed as a matrix/vector multiplication, provided the R-modules involved have bases (which will always be the case for finite dimensional vector spaces).

Let M be an R-module, and suppose that A = (α_1, ..., α_m), with m > 0, is a basis for M. In this setting, the ordering of the basis elements is important, and so we refer to A as an ordered basis. Now, A defines a canonical R-module isomorphism ε that sends (a_1, ..., a_m) ∈ R^{1×m} to a_1 α_1 + ··· + a_m α_m ∈ M. Thus, elements of M can be represented concretely as elements of R^{1×m}; however, this representation depends on the choice A of the ordered basis. The vector ε⁻¹(α) is called the coordinate vector of α (with respect to A).

Let N be an R-module, and suppose B = (β_1, ..., β_n), with n > 0, is an ordered basis for N. Just as in the previous paragraph, B defines a canonical R-module isomorphism δ : R^{1×n} → N.

Now let ρ : M → N be an arbitrary R-linear map. For any α ∈ M, if ε⁻¹(α) = (a_1, ..., a_m), then because ρ is R-linear, we have

    ρ(α) = ρ( ∑_{i=1}^{m} a_i α_i ) = ∑_{i=1}^{m} a_i ρ(α_i).

Thus, the action of ρ on M is completely determined by its action on the α_i.

Let us now define an m × n matrix T whose ith row, for i = 1, ..., m, is defined to be δ⁻¹(ρ(α_i)), that is, the coordinate vector of ρ(α_i) with respect to the ordered basis B. With T defined in this way, then for any α ∈ M we have

    δ⁻¹(ρ(α)) = ε⁻¹(α) T.

In words: if we multiply the coordinate vector of α on the right by T, we get the coordinate vector of ρ(α).

A special case of the above is when M = R^{1×m} and N = R^{1×n}, and A and B are the standard bases for M and N (i.e., for i = 1, ..., m, the ith vector of A has a 1 in position i and is zero everywhere else, and similarly for B). In this case, ρ(v) = vT for all v ∈ R^{1×m}.
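In this special case, the recipe for T is easy to carry out in code: apply ρ to each standard basis vector and stack the images as rows. A minimal Python sketch, assuming R = Z and a hypothetical example map (the names `matrix_of` and `row_times` are mine):

```python
def row_times(v, T):
    # v is a row vector of length m, T an m×n matrix; returns vT.
    return [sum(v[i] * T[i][j] for i in range(len(v))) for j in range(len(T[0]))]

def matrix_of(rho, m):
    # Row i of T is the image under rho of the ith standard basis vector.
    return [rho([int(k == i) for k in range(m)]) for i in range(m)]

# a hypothetical R-linear map from R^{1×2} to R^{1×3}, with R = Z
rho = lambda v: [v[0] + v[1], 2 * v[1], v[0]]
T = matrix_of(rho, 2)
assert row_times([3, 5], T) == rho([3, 5])
```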

To summarize, we see that an R-linear map ρ from M to N, together with particular ordered bases for M and N, uniquely determines a matrix T such that the action of multiplication on the right by T implements the action of ρ with respect to the given ordered bases. There may be many ordered bases for M and N to choose from, and different choices will in general lead to different matrices. In any case, from a computational perspective, the matrix T gives us an efficient way to compute the map ρ, assuming elements of M and N are represented as coordinate vectors with respect to the given ordered bases.

Of course, if one prefers, by simply transposing everything, one can equally well represent the action of ρ in terms of the action of multiplication of a column vector on the left by a matrix.

Example 15.1. Consider again the ring E = R[X]/(f), where f ∈ R[X] is monic of degree ℓ, and suppose that ℓ > 0 (see Examples 9.34, 9.43, 14.3, and 14.22). Let f = f_0 + f_1 X + ··· + f_{ℓ−1} X^{ℓ−1} + X^ℓ, where f_0, ..., f_{ℓ−1} ∈ R. Consider the element η = [X]_f ∈ E. Let ρ : E → E be the η-multiplication map, that is, the map that sends α ∈ E to ηα ∈ E. This is an R-linear map, and the matrix T ∈ R^{ℓ×ℓ} that represents this map with respect to the ordered basis (1, η, η², ..., η^{ℓ−1}) for E over R is readily seen to be

        ⎛   0      1      0     ···     0       ⎞
        ⎜   0      0      1     ···     0       ⎟
    T = ⎜                 ...                   ⎟ ,
        ⎜   0      0      0     ···     1       ⎟
        ⎝ −f_0   −f_1   −f_2    ···  −f_{ℓ−1}   ⎠

where for i = 1, ..., ℓ − 1, the ith row of T contains a 1 in position i + 1, and is zero everywhere else. This matrix is called the companion matrix of f. □
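The companion matrix is easy to build from the low-order coefficients of f, and one can check directly that multiplying a coordinate vector on the right by T implements multiplication by η. A small sketch with R = Z and the illustrative choice f = X² − X − 1 (helper names are mine):

```python
def companion(f_low):
    # f_low = [f_0, ..., f_{l-1}]: low-order coefficients of a monic f of degree l.
    l = len(f_low)
    T = [[int(j == i + 1) for j in range(l)] for i in range(l - 1)]  # shifted identity
    T.append([-c for c in f_low])                                    # last row: -f_0, ..., -f_{l-1}
    return T

def row_times(v, T):
    return [sum(v[i] * T[i][j] for i in range(len(v))) for j in range(len(T[0]))]

# f = X^2 - X - 1, so f_0 = f_1 = -1 and eta^2 = 1 + eta in E
T = companion([-1, -1])
# eta * (2 + 3*eta) = 2*eta + 3*eta^2 = 3 + 5*eta
assert row_times([2, 3], T) == [3, 5]
```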

Exercise 15.1. Let F be a finite field, and let A be a non-zero m × n matrix over F. Suppose one chooses a vector v ∈ F^{1×m} at random. Show that the probability that vA is the zero vector is at most 1/|F|.

Exercise 15.2. Design and analyze a probabilistic algorithm that takes as input matrices A, B, C ∈ Z_p^{m×m}, where p is a prime. The algorithm should run in time O(m² len(p)²) and should output either “yes” or “no” so that the following holds:

• if C = AB, then the algorithm should always output “yes”;

• if C ≠ AB, then the algorithm should output “no” with probability at least 0.999.


15.3 The inverse of a matrix

Let R be a ring.

For a square matrix A ∈ R^{n×n}, we call a matrix X ∈ R^{n×n} an inverse of A if XA = AX = I, where I is the n × n identity matrix.

It is easy to see that if A has an inverse, then the inverse is unique: if X and Y were inverses, then multiplying the equation I = AY on the left by X, we obtain X = X(AY) = (XA)Y = IY = Y.

Because the inverse of A is uniquely determined, we denote it by A⁻¹. If A has an inverse, we say that A is invertible, or non-singular. If A is not invertible, it is sometimes called singular. We will use the terms “invertible” and “not invertible.” Observe that A is the inverse of A⁻¹; that is, (A⁻¹)⁻¹ = A.

If A and B are invertible n × n matrices, then so is their product: in fact, it is easy to see that (AB)⁻¹ = B⁻¹A⁻¹ (verify). It follows that if A is an invertible matrix, and k is a non-negative integer, then Aᵏ is invertible with inverse (A⁻¹)ᵏ, which we also denote by A⁻ᵏ.
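The identity (AB)⁻¹ = B⁻¹A⁻¹ can be verified numerically. A small sketch over R = Z, using two unimodular 2 × 2 matrices whose inverses are written out by hand so that everything stays integral:

```python
def mat_mul(A, B):
    # Schoolbook matrix product over Z.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A,  B  = [[1, 1], [0, 1]], [[1, 0], [1, 1]]
Ai, Bi = [[1, -1], [0, 1]], [[1, 0], [-1, 1]]   # their inverses
I = [[1, 0], [0, 1]]
assert mat_mul(A, Ai) == I and mat_mul(B, Bi) == I
# (AB) * (B^{-1} A^{-1}) = I, so B^{-1} A^{-1} is the inverse of AB
assert mat_mul(mat_mul(A, B), mat_mul(Bi, Ai)) == I
```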

It is also easy to see that A is invertible if and only if the transposed matrix Aᵀ is invertible, in which case (Aᵀ)⁻¹ = (A⁻¹)ᵀ. Indeed, AX = I = XA holds if and only if XᵀAᵀ = I = AᵀXᵀ.

The following theorem connects invertibility to linear maps.

Theorem 15.3. Let A ∈ R^{n×n}, and let ρ : R^{1×n} → R^{1×n} be the R-linear map that sends v ∈ R^{1×n} to vA. Then A is invertible if and only if ρ is bijective.

Proof. Suppose A is invertible, and let X ∈ R^{n×n} be its inverse. The map ρ is surjective, since for any w ∈ R^{1×n}, w = wI = wXA = ρ(wX). The map