into account the zero polynomial, it follows that

    q^{sn} = 1 + \sum_{k=0}^{n-1} q^k u_{n-k},    (19.2)

which holds for all n ≥ 1. Replacing n by n − 1 in (19.2), we obtain

    q^{s(n-1)} = 1 + \sum_{k=0}^{n-2} q^k u_{n-1-k},    (19.3)

which holds for all n ≥ 2, and indeed, holds for n = 1 as well. Subtracting

q times (19.3) from (19.2), we deduce that for n ≥ 1,

    q^{sn} − q^{sn−s+1} = 1 + u_n − q,

and rearranging terms:

    u_n = q^{sn} − q^{sn−s+1} + q − 1.

434 Linearly generated sequences and applications

Therefore,

    Λ_m(s) = u_m / q^{sm} = 1 − 1/q^{s−1} + (q − 1)/q^{sm}.    □

From the above theorem, it follows that for s ≥ 1, the probability P_s that

Algorithm MP runs for more than s loop iterations is at most 1/q^{s−1}. If T

is the total number of loop iterations, then

    E[T] = \sum_{i≥1} P[T ≥ i] = 1 + \sum_{s≥1} P_s ≤ 1 + \sum_{s≥1} 1/q^{s−1} = 1 + q/(q−1) = O(1).
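The geometric-series step in this bound is easy to check numerically. The following small sanity check (function name and term count are mine, chosen for illustration) compares a long partial sum of 1 + \sum_{s≥1} 1/q^{s−1} against the closed form 1 + q/(q−1):

```python
from fractions import Fraction

def partial_bound(q, terms):
    # Partial sum of 1 + sum_{s=1..terms} 1/q^(s-1);
    # as terms grows, this converges to 1 + q/(q-1).
    return 1 + sum(Fraction(1, q ** (s - 1)) for s in range(1, terms + 1))

# For q = 2 the closed form is 1 + 2/1 = 3; for q = 3 it is 1 + 3/2 = 5/2.
for q in (2, 3):
    closed_form = 1 + Fraction(q, q - 1)
    approx = partial_bound(q, 200)
    assert approx < closed_form                        # partial sums stay below the limit
    assert closed_form - approx < Fraction(1, 10**9)   # and converge to it
```

Exact rational arithmetic (via `Fraction`) avoids any floating-point doubt about the comparison.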

Let us summarize all of the above analysis with the following:

Theorem 19.6. Let S be a sequence of elements of an F-vector space V of
finite dimension ℓ > 0 over F, where F is a finite field. Assume that S is
linearly generated over F with minimal polynomial φ ∈ F[X] of degree m, and
that S has full rank (i.e., the first m elements of S are linearly independent).
Then given an upper bound M on m, along with the first 2M elements of
S, Algorithm MP correctly computes φ using an expected number of O(Mℓ)
operations in F.
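Algorithm MP itself is not reproduced in this excerpt. As an illustrative sketch only (the function name and interface are mine, and this is the classical Berlekamp–Massey scalar routine, not the book's Algorithm MP), here is how the minimal polynomial of a linearly generated sequence of scalars over a prime field GF(p) can be computed from its first terms:

```python
def berlekamp_massey(seq, p):
    """Return (L, C): C = [1, c_1, ..., c_L] is the connection polynomial over
    GF(p), i.e. seq[n] + c_1*seq[n-1] + ... + c_L*seq[n-L] = 0 for L <= n < len(seq).
    The minimal polynomial is then X^L + c_1*X^(L-1) + ... + c_L."""
    C, B = [1], [1]        # current and previous connection polynomials
    L, m, b = 0, 1, 1      # m: steps since B changed; b: last non-zero discrepancy
    for n in range(len(seq)):
        # Discrepancy: how far C fails to predict seq[n].
        d = seq[n] % p
        for i in range(1, L + 1):
            d = (d + C[i] * seq[n - i]) % p
        if d == 0:
            m += 1
            continue
        coef = d * pow(b, p - 2, p) % p    # d / b in GF(p)
        T = C[:]
        if len(B) + m > len(C):
            C = C + [0] * (len(B) + m - len(C))
        for i, Bi in enumerate(B):         # C(X) <- C(X) - (d/b) * X^m * B(X)
            C[i + m] = (C[i + m] - coef * Bi) % p
        if 2 * L <= n:
            L, B, b, m = n + 1 - L, T, d, 1
        else:
            m += 1
    return L, C

# Fibonacci mod 7 satisfies s_{n+2} = s_{n+1} + s_n, so the minimal
# polynomial should be X^2 - X - 1 = X^2 + 6X + 6 over GF(7).
fib = [0, 1, 1, 2, 3, 5, 1, 6, 0, 6]
L, C = berlekamp_massey(fib, 7)
```

In Algorithm MP, a routine of this kind is applied to the projected scalar sequence (π(α_i)); the main loop then tests whether the resulting polynomial annihilates S itself and retries with a fresh projection if not.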

We close this section with the following observation. Suppose the sequence S is of the form (β, τ(β), τ²(β), . . .), where β ∈ V and τ: V → V is
an F-linear map. Suppose that with respect to some ordered basis for V, elements of V are represented as elements of F^{1×ℓ}, and elements of D_F(V) are
represented as elements of F^{ℓ×1}. The linear map τ also has a corresponding
representation as a matrix A ∈ F^{ℓ×ℓ}, so that evaluating τ at a point α in
V corresponds to multiplying the coordinate vector of α on the right by A.
Now, suppose β ∈ V has coordinate vector b ∈ F^{1×ℓ} and that π ∈ D_F(V)
has coordinate vector c ∈ F^{ℓ×1}. Then if S̃ is the sequence of coordinate
vectors of the elements of S, we have

    S̃ = (b A^i)_{i=0}^{∞}  and  S^π = (b A^i c)_{i=0}^{∞}.

This more concrete, matrix-oriented point of view is sometimes useful; in
particular, it makes quite transparent the symmetry of the roles played by
β and π in forming the projected sequence.
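This symmetry is easy to see in a toy computation. The sketch below (the specific matrix, vectors, and helper names are mine, for illustration only) builds the projected sequence both by iterating the row vector b on the left of A and by iterating the column vector c on the right; both yield the same scalars b A^i c:

```python
p = 7
A = [[1, 1], [1, 0]]    # 2x2 matrix over GF(7) representing tau
b = [0, 1]              # coordinate row vector of beta
c = [1, 0]              # coordinate column vector of pi

def row_times_mat(v, M, p):
    return [sum(v[k] * M[k][j] for k in range(len(v))) % p for j in range(len(M[0]))]

def mat_times_col(M, v, p):
    return [sum(M[i][k] * v[k] for k in range(len(v))) % p for i in range(len(M))]

def dot(u, v, p):
    return sum(x * y for x, y in zip(u, v)) % p

n = 8
seq_from_beta = []
v = b[:]
for _ in range(n):                  # a_i = (b A^i) c: iterate the row vector
    seq_from_beta.append(dot(v, c, p))
    v = row_times_mat(v, A, p)

seq_from_pi = []
w = c[:]
for _ in range(n):                  # a_i = b (A^i c): iterate the column vector
    seq_from_pi.append(dot(b, w, p))
    w = mat_times_col(A, w, p)

assert seq_from_beta == seq_from_pi
```

Either direction costs one matrix-vector product per term; which side one iterates is purely a matter of convenience.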

Exercise 19.6. If |F| = q and φ ∈ F[X] is monic and factors into monic
irreducible polynomials in F[X] as φ = p_1^{e_1} ⋯ p_r^{e_r}, show that

    Λ_φ(1) = \prod_{i=1}^{r} (1 − q^{−deg(p_i)}) ≥ 1 − \sum_{i=1}^{r} q^{−deg(p_i)}.

From this, conclude that the probability that Algorithm MP terminates


after just one loop iteration is 1 − O(m/q), where m = deg(φ). Thus, if q

is very large relative to m, it is highly likely that Algorithm MP terminates

after just one iteration of the main loop.
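As a quick numeric illustration of the exercise's inequality (an illustrative check with a made-up factorization pattern, not part of the exercise), the product form of Λ_φ(1) always dominates the 1-minus-sum lower bound:

```python
from fractions import Fraction

def lambda_phi_at_one(q, factor_degrees):
    # Lambda_phi(1) = prod_i (1 - q^(-deg(p_i))), one term per distinct
    # irreducible factor p_i; the exponents e_i do not enter.
    result = Fraction(1)
    for d in factor_degrees:
        result *= 1 - Fraction(1, q ** d)
    return result

q = 2
degrees = [1, 1, 2, 3]   # hypothetical degrees of the distinct irreducible factors
lhs = lambda_phi_at_one(q, degrees)
rhs = 1 - sum(Fraction(1, q ** d) for d in degrees)
assert lhs >= rhs        # the inequality from the exercise
```

Note that the lower bound can even be negative (as here), while the product itself is always positive; the bound is only informative when q is large relative to the factor degrees, which is exactly the regime the exercise addresses.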

19.4 Solving sparse linear systems

Let V be a vector space of finite dimension ℓ > 0 over a finite field F, and
let τ: V → V be an F-linear map. The goal of this section is to develop
time- and space-efficient algorithms for solving equations of the form

    τ(γ) = δ;    (19.4)

that is, given τ and δ ∈ V, find γ ∈ V satisfying (19.4). The algorithms we
develop will have the following properties: they will be probabilistic, and
will use an expected number of O(ℓ²) operations in F, an expected number
of O(ℓ) evaluations of τ, and space for O(ℓ) elements of F. By an "evaluation
of τ," we mean the computation of τ(α) for some α ∈ V.

We shall assume that elements of V are represented as coordinate vectors
with respect to some fixed ordered basis for V. Now, if the matrix representing τ with respect to the given ordered basis is sparse, having, say,
ℓ^{1+o(1)} non-zero entries, then the space required to represent τ is ℓ^{1+o(1)}
elements of F, and the time required to evaluate τ is ℓ^{1+o(1)} operations in
F. Under these assumptions, our algorithms to solve (19.4) use an expected
number of ℓ^{2+o(1)} operations in F, and space for ℓ^{1+o(1)} elements of F. This
is to be compared with standard Gaussian elimination: even if the original
matrix is sparse, during the execution of the algorithm, most of the entries
in the matrix may eventually be "filled in" with non-zero field elements,
leading to a running time of Ω(ℓ³) operations in F, and a space requirement
of Ω(ℓ²) elements of F. Thus, the algorithms presented here will be much
more efficient than Gaussian elimination when the matrix representing τ is
sparse.
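To make this concrete, here is a minimal sketch (the representation, numbers, and names are mine, not from the text) of a linear map τ stored as a list of its non-zero entries; evaluating τ then costs a number of operations in F proportional to the number of non-zero entries rather than to ℓ²:

```python
p = 101   # a small prime, so F = GF(p)

# tau represented sparsely as (row, col, value) triples over GF(p);
# the space used is proportional to the number of non-zero entries, not ell^2.
ell = 6
entries = [(0, 0, 3), (1, 4, 7), (2, 2, 1), (3, 0, 5), (4, 5, 2), (5, 1, 9)]

def tau(v):
    # One evaluation of tau: O(len(entries)) operations in F.
    out = [0] * ell
    for i, j, a in entries:
        out[i] = (out[i] + a * v[j]) % p
    return out

v = [1, 2, 3, 4, 5, 6]
w = tau(v)
```

The same interface works for any structured map (e.g. one evaluated via FFT-like transforms); the algorithms of this section only ever touch τ through such black-box evaluations.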

We hasten to point out that the algorithms presented here may be more
efficient than Gaussian elimination in other cases, as well. All that matters
is that τ can be evaluated using o(ℓ²) operations in F and/or represented
using space for o(ℓ²) elements of F; in either case, we obtain a time and/or
space improvement over Gaussian elimination. Indeed, there are applications
where the matrix of the linear map τ may not be sparse, but nevertheless has special structure that allows it to be represented and evaluated
in subquadratic time and/or space.

We shall only present algorithms that work in two special, but important,
cases:

• the first case is where τ is invertible,

• the second case is where τ is not invertible, δ = 0, and a non-zero
solution γ to (19.4) is required (i.e., we are looking for a non-zero
element of ker(τ)).

In both cases, the key will be to use Algorithm MP in §19.3 to find the
minimal polynomial φ of the linearly generated sequence

    S := (α_0, α_1, . . .),    (α_i := τ^i(β), i = 0, 1, . . .),    (19.5)

where β is a suitably chosen element of V. From the discussion in Example 19.3, this sequence has full rank, and so we may use Algorithm MP. We
may use M := ℓ as an upper bound on the degree of φ (assuming we know
nothing more about τ and β that would allow us to use a smaller upper
bound). In using Algorithm MP in this application, note that we do not
want to store α_0, . . . , α_{2ℓ−1}; if we did, we would not satisfy our stated space
bound. Instead of storing the α_i in a "warehouse," we use a "just in time"
strategy for computing them, as follows:

• In the body of the main loop of Algorithm MP, where we calculate the
values a_i := π(α_i), for i = 0 . . . 2ℓ − 1, we perform the computation
as follows:

    α ← β
    for i ← 0 to 2ℓ − 1 do
        a_i ← π(α), α ← τ(α)

• In the test at the bottom of the main loop of Algorithm MP, if g =
\sum_{j=0}^{k} g_j X^j, we compute ν := g ⋆ S ∈ V as follows:

    ν ← 0, α ← β