rithm whose operation count is O(ℓ^1.5). If we perform two levels of recursion, this is reduced to O(ℓ^1.25). For practical purposes, this is probably enough; however, to get an asymptotically better complexity bound, we can let the algorithm recurse all the way down to inputs of some (appropriately chosen) constant length. Show that if we do


this, the operation count of the recursive algorithm is O(ℓ len(ℓ)^β) for some constant β (whose value depends on α1 and α2).

The approach used in the previous exercise was a bit sloppy. With a bit more care, one can use the same ideas to get an algorithm that multiplies polynomials over R of length at most ℓ using O(ℓ len(ℓ) len(len(ℓ))) operations in R, assuming 2_R ∈ R*. The next exercise applies similar ideas, but with a few twists, to the problem of integer multiplication.

Exercise 18.27. This exercise uses the FFT to develop a linear-time algorithm for integer multiplication; however, a rigorous analysis depends on an unproven conjecture (which follows from a generalization of the Riemann hypothesis). Suppose we want to multiply two ℓ-bit, positive integers a and b (represented internally using the data structure described in §3.3). Throughout this exercise, assume that all computations are done on a RAM, and that arithmetic on integers of length O(len(ℓ)) takes time O(1). Let k be an integer parameter with k = Θ(len(ℓ)), and let m := ⌈ℓ/k⌉. We may write a = Σ_{i=0}^{m−1} a_i 2^{ki} and b = Σ_{i=0}^{m−1} b_i 2^{ki}, where 0 ≤ a_i < 2^k and 0 ≤ b_i < 2^k. Let n be the integer determined by 2m ≤ 2^n < 4m.
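The splitting step can be sketched in a few lines of Python; the helper names and the sample parameters below are our own illustration, not part of the exercise:

```python
# Write an integer a as sum_{i=0}^{m-1} a_i * 2^(k*i) with 0 <= a_i < 2^k,
# and recombine the digits to recover a.

def split_base_2k(a: int, k: int, m: int) -> list[int]:
    """Return [a_0, ..., a_{m-1}] with a = sum a_i * 2**(k*i)."""
    mask = (1 << k) - 1
    return [(a >> (k * i)) & mask for i in range(m)]

def recombine(digits: list[int], k: int) -> int:
    """Inverse of split_base_2k."""
    return sum(d << (k * i) for i, d in enumerate(digits))

ell, k = 64, 8
m = -(-ell // k)                  # m := ceil(ell / k)
a = 0xDEADBEEFCAFEBABE
digits = split_base_2k(a, k, m)
assert all(0 <= d < 2**k for d in digits)
assert recombine(digits, k) == a
```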

(a) Assuming Conjecture 5.24 (and the result of Exercise 5.22), and assuming a deterministic, polynomial-time primality test (such as the one to be presented in Chapter 22), show how to efficiently generate a prime p ≡ 1 (mod 2^n) and an element ω ∈ Z_p^* of multiplicative order 2^n, such that 2^{2k} m < p ≤ ℓ^{O(1)}. Your algorithm should be probabilistic, and run in expected time polynomial in len(ℓ).
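A toy sketch of the search in part (a), hedged: it uses naive trial division in place of the deterministic primality test of Chapter 22, and small parameters; `find_p_and_omega` and its arguments are our own illustration. It searches primes of the form q·2^n + 1 and extracts an element of order exactly 2^n:

```python
# Find a prime p = q*2^n + 1 above a given bound, and omega of
# multiplicative order exactly 2^n in Z_p^*.

def is_prime(x: int) -> bool:
    """Naive trial division; fine for toy parameters only."""
    if x < 2:
        return False
    d = 2
    while d * d <= x:
        if x % d == 0:
            return False
        d += 1
    return True

def find_p_and_omega(n: int, lower: int):
    """Return (p, omega): p prime, p > lower, p ≡ 1 (mod 2^n),
    omega of multiplicative order 2^n in Z_p^*."""
    q = lower // (1 << n) + 1
    while True:
        p = q * (1 << n) + 1
        if is_prime(p):
            for g in range(2, p):
                omega = pow(g, (p - 1) >> n, p)
                # omega^(2^n) = g^(p-1) = 1, so its order divides 2^n;
                # it is exactly 2^n iff omega^(2^(n-1)) = -1 (mod p),
                # since the only square roots of 1 in Z_p are +-1.
                if pow(omega, 1 << (n - 1), p) == p - 1:
                    return p, omega
        q += 1

p, omega = find_p_and_omega(5, 1000)   # p > 1000 with p ≡ 1 (mod 32)
assert p % 32 == 1
assert pow(omega, 32, p) == 1 and pow(omega, 16, p) == p - 1
```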

(b) Assuming you have computed p and ω as in part (a), let ā := Σ_{i=0}^{m−1} [a_i]_p X^i ∈ Z_p[X] and b̄ := Σ_{i=0}^{m−1} [b_i]_p X^i ∈ Z_p[X], and show how to compute c̄ := ā·b̄ ∈ Z_p[X] in time O(ℓ) using the FFT (over Z_p). Here, you may store elements of Z_p in single memory cells, so that operations in Z_p take time O(1).

(c) Assuming you have computed c̄ ∈ Z_p[X] as in part (b), show how to obtain c := ab in time O(ℓ).

(d) Conclude that assuming Conjecture 5.24, we can multiply two ℓ-bit integers on a RAM in time O(ℓ).
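The whole pipeline of parts (b)-(d) can be sketched with toy parameters of our own choosing: p := 257 and k := 2 (3 is a primitive root modulo the Fermat prime 257, so ω below has the required order, and inputs of up to about 30 bits keep every true coefficient below p). A serious implementation would choose p as in part (a):

```python
# Toy integer multiplication: split into base-2^k digits, multiply the
# digit polynomials over Z_p with an FFT, then propagate carries.

def ntt(vec, omega, p):
    """Recursive radix-2 FFT over Z_p; len(vec) must be a power of two
    and omega a root of unity of that order (or its inverse)."""
    N = len(vec)
    if N == 1:
        return list(vec)
    even = ntt(vec[0::2], omega * omega % p, p)
    odd = ntt(vec[1::2], omega * omega % p, p)
    out, w = [0] * N, 1
    for i in range(N // 2):
        t = w * odd[i] % p
        out[i] = (even[i] + t) % p
        out[i + N // 2] = (even[i] - t) % p
        w = w * omega % p
    return out

def poly_mul_mod_p(f, g, N, omega, p):
    """Product of f, g in Z_p[X] via FFT, pointwise product, inverse FFT."""
    F = ntt(f + [0] * (N - len(f)), omega, p)
    G = ntt(g + [0] * (N - len(g)), omega, p)
    H = [x * y % p for x, y in zip(F, G)]
    inv = ntt(H, pow(omega, p - 2, p), p)   # inverse transform, up to 1/N
    inv_N = pow(N, p - 2, p)
    return [x * inv_N % p for x in inv]

def fft_mul(a, b, k=2, p=257):
    m = max(a.bit_length(), b.bit_length(), k) // k + 1
    N = 1
    while N < 2 * m:
        N *= 2
    # p - 1 = 256 = 2^8 and 3 is a primitive root mod 257,
    # so this omega has multiplicative order exactly N
    omega = pow(3, (p - 1) // N, p)
    mask = (1 << k) - 1
    f = [(a >> (k * i)) & mask for i in range(m)]
    g = [(b >> (k * i)) & mask for i in range(m)]
    c = poly_mul_mod_p(f, g, N, omega, p)
    # for these toy sizes each true coefficient is < 2^(2k) * m <= p,
    # so there is no wraparound mod p and the carries recover a*b
    return sum(ci << (k * i) for i, ci in enumerate(c))

assert fft_mul(12345, 67890) == 12345 * 67890
```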

Note that even if one objects to our accounting practices, and insists on charging O(len(ℓ)^2) time units for arithmetic on numbers of length O(len(ℓ)),


the algorithm in the previous exercise runs in time O(ℓ len(ℓ)^2), which is "almost" linear time.

Exercise 18.28. Continuing with the previous exercise:

(a) Show how the algorithm presented there can be implemented on a RAM that has only built-in addition, subtraction, and branching instructions, but no multiplication or division instructions, and still run in time O(ℓ). Also, memory cells should store numbers of length at most len(ℓ) + O(1). Hint: represent elements of Z_p as sequences of base-2^t digits, where t ≈ α len(ℓ) for some constant α < 1; use table lookup to multiply t-bit numbers, and to perform 2t-by-t-bit divisions; for α sufficiently small, you can build these tables in time o(ℓ).

(b) Using Theorem 5.25, show how to make this algorithm fully deterministic and rigorous, provided that on inputs of length ℓ, it is provided with a certain bit string σ of length O(len(ℓ)) (this is called a non-uniform algorithm).
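The table-lookup idea from the hint to part (a) can be sketched as follows, with a toy digit size t of our own choosing; the multiplication table itself is built using additions only (the shifts stand in for the digit bookkeeping):

```python
# Precompute a t-bit-by-t-bit multiplication table without any multiply
# instruction, then multiply 2t-bit numbers via four table lookups.

t = 4                      # digits are base 2^t
B = 1 << t

# mul_table[x][y] == x * y, built by repeated addition: x*y = x*(y-1) + x
mul_table = [[0] * B for _ in range(B)]
for x in range(1, B):
    for y in range(1, B):
        mul_table[x][y] = mul_table[x][y - 1] + x

def mul_2t(a: int, b: int) -> int:
    """Multiply two 2t-bit numbers by schoolbook multiplication on their
    base-2^t digits, using only table lookups and additions."""
    a0, a1 = a & (B - 1), a >> t
    b0, b1 = b & (B - 1), b >> t
    return (mul_table[a0][b0]
            + ((mul_table[a0][b1] + mul_table[a1][b0]) << t)
            + (mul_table[a1][b1] << (2 * t)))

assert all(mul_2t(a, b) == a * b for a in range(256) for b in range(256))
```

Building the table costs about 2^{2t} additions, which is o(ℓ) once t ≈ α len(ℓ) for α < 1/2; the 2t-by-t-bit division table from the hint is built the same way.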

Exercise 18.29. This exercise shows how the algorithm in Exercise 18.27

can be made quite concrete, and fairly practical, as well.

(a) The number p := 2^59 · 27 + 1 is a 64-bit prime. Show how to use this value of p in conjunction with the algorithm in Exercise 18.27 with k = 20 and any value of ℓ up to 2^27.

(b) The numbers p1 := 2^30 · 3 + 1, p2 := 2^28 · 13 + 1, and p3 := 2^27 · 29 + 1 are 32-bit primes. Show how to use the Chinese remainder theorem to modify the algorithm in Exercise 18.27, so that it uses the three primes p1, p2, p3, and so that it works with k = 32 and any value of ℓ up to 2^31. This variant may be quite practical on a 32-bit machine with built-in instructions for 32-bit multiplication and 64-by-32-bit division.
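The Chinese remainder step of part (b) might look as follows (`crt3` is our own helper name). With k = 32 and ℓ ≤ 2^31 we have m ≤ 2^26, so each coefficient of the product polynomial is below 2^{64} · 2^{26} = 2^{90} < p1·p2·p3 ≈ 2^{95}, and is therefore determined exactly by its three residues:

```python
# Recover a product-polynomial coefficient from its residues modulo
# the three 32-bit primes, via Garner-style CRT reconstruction.

p1, p2, p3 = 3 * 2**30 + 1, 13 * 2**28 + 1, 29 * 2**27 + 1

def crt3(r1: int, r2: int, r3: int) -> int:
    """Recover 0 <= x < p1*p2*p3 from x mod p1, x mod p2, x mod p3."""
    x = r1
    t = (r2 - x) * pow(p1, -1, p2) % p2
    x += t * p1                        # now x is correct mod p1*p2
    t = (r3 - x) * pow(p1 * p2, -1, p3) % p3
    x += t * p1 * p2                   # now x is correct mod p1*p2*p3
    return x

c = 2**63 + 12345                      # a sample coefficient
assert c < p1 * p2 * p3
assert crt3(c % p1, c % p2, c % p3) == c
```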

The previous three exercises indicate that we can multiply integers in essentially linear time, both in theory and in practice. As mentioned in §3.6, there is a different, fully deterministic and rigorously analyzed algorithm that multiplies integers in linear time on a RAM. In fact, that algorithm works on a very restricted type of machine called a "pointer machine," which can be simulated in "real time" on a RAM with a very restricted instruction set (including the type in the previous exercise). That algorithm works with finite approximations to complex roots of unity, rather than roots of unity in a finite field.


We close this section with a cute application of fast polynomial multiplication to the problem of factoring integers.

Exercise 18.30. Let n be a large, positive integer. We can factor n using trial division in time n^{1/2+o(1)}; however, using fast polynomial arithmetic in Z_n[X], one can get a simple, deterministic, and rigorous algorithm that factors n in time n^{1/4+o(1)}. Note that all of the factoring algorithms discussed in Chapter 16, while faster, are either probabilistic, or deterministic but heuristic. Assume that we can multiply polynomials in Z_n[X] of length at most ℓ using M(ℓ) operations in Z_n, where M is a well-behaved complexity function, and M(ℓ) = ℓ^{1+o(1)} (the algorithm from Exercise 18.26 would suffice).

(a) Let ℓ be a positive integer, and for i = 1, …, ℓ, let

a_i := ∏_{j=0}^{ℓ−1} (iℓ − j) mod n.

Using fast polynomial arithmetic, show how to compute all of the integers a_1, …, a_ℓ in time ℓ^{1+o(1)} len(n)^{O(1)}.

(b) Using the result of part (a), show how to factor n in time n^{1/4+o(1)} using a deterministic algorithm.
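The strategy behind the exercise can be sketched naively, as our own illustration: with ℓ ≈ n^{1/4}, the blocks of consecutive integers [iℓ − ℓ + 1, iℓ] for i = 1, …, ℓ cover 1, …, ℓ^2 ≥ √n, so some block product a_i shares a factor with n, and gcd(a_i, n) exposes it. Computing each a_i by direct multiplication, as below, takes time about n^{1/2}; part (a) replaces the inner loop with fast polynomial arithmetic to reach n^{1/4+o(1)}:

```python
# Deterministic factoring via block products a_i = prod (i*ell - j) mod n.

from math import gcd, isqrt

def factor(n: int) -> int:
    """Return a nontrivial factor of a composite n."""
    ell = isqrt(isqrt(n)) + 1          # ell > n^(1/4), so ell^2 > sqrt(n)
    for i in range(1, ell + 1):
        a_i = 1
        for j in range(ell):           # a_i = prod_{j=0}^{ell-1} (i*ell - j) mod n
            a_i = a_i * (i * ell - j) % n
        g = gcd(a_i, n)
        if 1 < g < n:
            return g
        if g == n:
            # the block mixes several factors of n; rescan its elements
            for x in range(i * ell, i * ell - ell, -1):
                g = gcd(x, n)
                if 1 < g < n:
                    return g
    raise ValueError("no factor found; n is prime or too small")

n = 1009 * 1013                        # product of two primes
f = factor(n)
assert f in (1009, 1013) and n % f == 0
```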

18.7 Notes

Exercise 18.4 is based on an algorithm of Brent and Kung [20]. Using fast matrix arithmetic, Brent and Kung show how this problem can be solved using O(ℓ^{(ω+1)/2}) operations in R, where ω is the exponent for matrix multiplication (see §15.6), and so (ω + 1)/2 < 1.7.

The interpretation of Lagrange interpolation as "secret sharing" (see §18.4.2), and its application to cryptography, was made by Shamir [85]. Reed–Solomon codes were first proposed by Reed and Solomon [77], although the decoder presented here was developed later. Theorem 18.7 was proved by Mills [64]. The Reed–Solomon code is just one way of detecting and correcting errors; we have barely scratched the surface of this subject.

Just as in the case of integer arithmetic, the basic "pencil and paper" quadratic-time algorithms discussed in this chapter for polynomial arithmetic are not the best possible. The fastest known algorithms for multiplication of polynomials of length ℓ over a ring R take O(ℓ len(ℓ) len(len(ℓ))) operations in R. These algorithms are all variations on the basic FFT algorithm (see Exercise 18.25), but work without assuming that 2_R ∈ R* or


that R contains any particular primitive roots of unity (we developed some of the ideas in Exercise 18.26). The Euclidean and extended Euclidean algorithms for polynomials over a field F can be implemented so as to take O(ℓ len(ℓ)^2 len(len(ℓ))) operations in F, as can the algorithms for Chinese remaindering and rational function reconstruction. See the book by von zur Gathen and Gerhard [37] for details (as well as for an analysis of the Eu-