The quadratic-time algorithms presented in §3.3 for integer multiplication

and division are by no means the fastest possible. The next exercise develops

a faster multiplication algorithm.

Exercise 3.32. Suppose we have two positive, ℓ-bit integers a and b such
that a = a1·2^k + a0 and b = b1·2^k + b0, where 0 ≤ a0 < 2^k and 0 ≤ b0 < 2^k.
Then

ab = a1 b1·2^(2k) + (a0 b1 + a1 b0)·2^k + a0 b0.

Show how to compute the product ab in time O(ℓ), given the products a0 b0,
a1 b1, and (a0 − a1)(b0 − b1). From this, design a recursive algorithm that
computes ab in time O(ℓ^(log2 3)). (Note that log2 3 ≈ 1.58.)
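The recursion can be sketched in Python, whose built-in arbitrary-precision product stands in for the base case; the cutoff of 2^32 and the function name are illustrative choices, not part of the exercise:

```python
def karatsuba(a, b):
    """Multiply integers with three recursive half-size products,
    following the identity of Exercise 3.32."""
    if min(abs(a), abs(b)) < 2**32:
        return a * b                        # base case: machine-size product
    k = max(a.bit_length(), b.bit_length()) // 2
    a1, a0 = a >> k, a & ((1 << k) - 1)     # a = a1*2^k + a0, 0 <= a0 < 2^k
    b1, b0 = b >> k, b & ((1 << k) - 1)     # b = b1*2^k + b0, 0 <= b0 < 2^k
    hi = karatsuba(a1, b1)                  # a1*b1
    lo = karatsuba(a0, b0)                  # a0*b0
    # a0*b1 + a1*b0 = a0*b0 + a1*b1 - (a0 - a1)*(b0 - b1),
    # so the middle term costs only one more recursive product.
    mid = hi + lo - karatsuba(a0 - a1, b0 - b1)
    return (hi << (2 * k)) + (mid << k) + lo
```

Note that a0 − a1 may be negative; Python's floor shifts keep the splitting identity valid for negative inputs, so the recursion remains correct.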

The algorithm in the previous exercise is also not the best possible. In fact,
it is possible to multiply ℓ-bit integers on a RAM in time O(ℓ), but we do not
explore this any further here (see §3.6).

The following exercises explore the relationship between integer multipli-

cation and related problems. We assume that we have an algorithm that

multiplies two integers of at most ℓ bits in time M(ℓ). It is convenient (and

reasonable) to assume that M is a well-behaved complexity function.

By this, we mean that M maps positive integers to positive real numbers,

and

• for all positive integers a and b, we have M (a + b) ≥ M (a) + M (b),

and

• for all real c > 1 there exists real d > 1, such that for all positive
integers a and b, if a ≤ cb, then M(a) ≤ dM(b).

Exercise 3.33. Let α > 0, β ≥ 1, γ ≥ 0, δ ≥ 0 be real constants. Show
that

M(ℓ) := α ℓ^β len(ℓ)^γ len(len(ℓ))^δ

is a well-behaved complexity function.

Exercise 3.34. Give an algorithm for Exercise 3.22 that runs in time

O(M (len(n)) len(k)).

Hint: divide and conquer.

Exercise 3.35. We can represent a “floating point” number ẑ as a pair
(a, e), where a and e are integers; the value of ẑ is the rational number
a·2^e, and we call len(a) the precision of ẑ. We say that ẑ is a k-bit
approximation of a real number z if ẑ has precision k and ẑ = (1 + ε)z for
some |ε| ≤ 2^(−k+1). Show how to compute, given positive integers b and
k, a k-bit approximation of 1/b in time O(M(k)). Hint: using Newton
iteration, show how to go from a t-bit approximation of 1/b to a (2t − 2)-bit
approximation of 1/b, making use of just the high-order O(t) bits of b,
in time O(M(t)). Newton iteration is a general method of iteratively
approximating a root of an equation f(x) = 0 by starting with an initial
approximation x0, and computing subsequent approximations by the formula
x_(i+1) = x_i − f(x_i)/f′(x_i), where f′(x) is the derivative of f(x). For this
exercise, apply Newton iteration to the function f(x) = x^(−1) − b.
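A small Python sketch of the iteration may help fix ideas. It works at one fixed scale 2^e and uses all of b at every step, so it illustrates the convergence only, not the precision-doubling bookkeeping that yields the O(M(k)) bound; the choice of e and the iteration count are assumptions made here, not taken from the text:

```python
def recip_approx(b, k):
    """Return (x, e) with x/2**e a roughly k-bit approximation of 1/b,
    using the integer Newton step x <- 2*x - b*x*x/2**e
    (Newton iteration for f(x) = 1/x - b, rescaled by 2**e)."""
    e = b.bit_length() + k + 1     # working scale; truncation error <= 2**-(k+1)
    x = 1 << (k + 1)               # first guess: within a factor of 2 of 2**e / b
    # Each exact Newton step squares the relative error, so
    # about log2(k) steps suffice; a couple extra absorb truncation.
    for _ in range(k.bit_length() + 2):
        x = 2 * x - ((b * x * x) >> e)
    return x, e
```

For example, recip_approx(7, 8) returns a pair (x, e) with x/2^e within relative error 2^(−7) of 1/7.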

Exercise 3.36. Using the result of the previous exercise, given positive
integers a and b of bit length at most ℓ, show how to compute ⌊a/b⌋ and
a mod b in time O(M(ℓ)). From this, we see that, up to a constant factor,
division with remainder is no harder than multiplication.
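The shape of the reduction can be sketched as follows; this is illustrative only, with Python's exact // standing in for the Newton-computed reciprocal of Exercise 3.35 (which is where the O(M(ℓ)) cost would come from), and the function name is made up:

```python
def divmod_via_recip(a, b):
    """Compute (a // b, a mod b) from an approximate reciprocal of b:
    multiply by the reciprocal, then fix the quotient by a small correction."""
    e = 2 * a.bit_length() + 2
    r = (1 << e) // b          # stand-in for a k-bit approximation of 1/b
    q = (a * r) >> e           # approximate quotient, off by at most a little
    # Correct the candidate quotient; with enough reciprocal precision
    # these loops run only O(1) times.
    while (q + 1) * b <= a:
        q += 1
    while q * b > a:
        q -= 1
    return q, a - q * b
```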

Exercise 3.37. Using the result of the previous exercise, give an algorithm

for Exercise 3.23 that runs in time O(M (len(n)) len(k)). Hint: divide and

conquer.

Exercise 3.38. Give an algorithm for Exercise 3.24 that runs in time

O(M (len(n))). Hint: Newton iteration.

Exercise 3.39. Give algorithms for Exercise 3.25 that run in time
O(M(ℓ) len(ℓ)), where ℓ := len(n). Hint: divide and conquer.

Exercise 3.40. Suppose we have an algorithm that computes the square of
an ℓ-bit integer in time S(ℓ), where S is a well-behaved complexity function.
Show how to use this algorithm to compute the product of two arbitrary
integers of at most ℓ bits in time O(S(ℓ)).
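One standard route (not necessarily the intended solution) is the identity 4ab = (a + b)^2 − (a − b)^2: since a + b and a − b have at most ℓ + 1 bits, well-behavedness of S gives the O(S(ℓ)) bound. A sketch, with the squaring routine passed in as a parameter:

```python
def mul_from_squares(a, b, square):
    """Multiply two integers using two calls to a squaring routine,
    via the identity 4*a*b = (a + b)**2 - (a - b)**2."""
    # The difference is always divisible by 4, so // is exact.
    return (square(a + b) - square(a - b)) // 4
```

If the squaring routine only accepts non-negative inputs, apply it to the absolute values; squares are unaffected by sign.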

3.6 Notes

Shamir [84] shows how to factor an integer in polynomial time on a RAM,

but where the numbers stored in the memory cells may have exponentially

many bits. As there is no known polynomial-time factoring algorithm on

any realistic machine, Shamir's algorithm demonstrates the importance of

restricting the sizes of numbers stored in the memory cells of our RAMs to

keep our formal model realistic.

The most practical implementations of algorithms for arithmetic on large
integers are written in low-level “assembly language,” specific to a particular
machine's architecture (e.g., the GNU Multi-Precision library GMP,
available at www.swox.com/gmp). Besides the general fact that such
hand-crafted code is more efficient than that produced by a compiler, there is
another, more important reason for using such code. A typical 32-bit
machine often comes with instructions that allow one to compute the 64-bit
product of two 32-bit integers, and similarly, instructions to divide a 64-bit
integer by a 32-bit integer (obtaining both the quotient and remainder).
However, high-level programming languages do not (as a rule) provide any
access to these low-level instructions. Indeed, we suggested in §3.3 using a
value for the base B of about half the word-size of the machine, so as to
avoid overflow. However, if one codes in assembly language, one can take B
to be much closer to, or even equal to, the word-size of the machine. Since
our basic algorithms for multiplication and division run in time quadratic
in the number of base-B digits, the effect of doubling the bit-length of B is
to decrease the running time of these algorithms by a factor of four. This
effect, combined with the improvements one might typically expect from
using assembly-language code, can easily lead to a five- to ten-fold decrease in
the running time, compared to an implementation in a high-level language.
This is, of course, a significant improvement for those interested in serious
“number crunching.”

The “classical,” quadratic-time algorithms presented here for integer
multiplication and division are by no means the best possible: there are
algorithms that are asymptotically faster. We saw this in the algorithm in
Exercise 3.32, which was originally invented by Karatsuba [52] (although
Karatsuba is one of two authors on this paper, the paper gives exclusive
credit for this particular result to Karatsuba). That algorithm allows us to
multiply two ℓ-bit integers in time O(ℓ^(log2 3)). The fastest known algorithm
for multiplying two ℓ-bit integers on a RAM runs in time O(ℓ). This
algorithm is due to Schönhage, and actually works on a very restricted type of
RAM called a “pointer machine” (see Problem 12, Section 4.3.3 of Knuth
[54]). See Exercise 18.27 later in this text for a much simpler (but heuristic)
O(ℓ) multiplication algorithm.

Another model of computation is that of Boolean circuits. In this model
of computation, one considers families of Boolean circuits (with, say, the
usual “and,” “or,” and “not” gates) that compute a particular function:
for every input length, there is a different circuit in the family that computes
the function on inputs of that length. One natural notion of complexity for
such circuit families is the size of the circuit (i.e., the number of gates and
wires in the circuit), which is measured as a function of the input length.
The smallest known Boolean circuit that multiplies two ℓ-bit numbers has
size O(ℓ len(ℓ) len(len(ℓ))). This result is due to Schönhage and Strassen [82].

It is hard to say which model of computation, the RAM or circuits, is
“better.” On the one hand, the RAM very naturally models computers as
we know them today: one stores small numbers, like array indices, counters,
and pointers, in individual words of the machine, and processing such
a number typically takes a single “machine cycle.” On the other hand, the
RAM model, as we formally defined it, invites a certain kind of “cheating,”
as it allows one to stuff O(len(ℓ))-bit integers into memory cells. For example,
even with the simple, quadratic-time algorithms for integer arithmetic
discussed in §3.3, we can choose the base B to have len(ℓ) bits, in which
case these algorithms would run in time O((ℓ/len(ℓ))^2). However, just to
keep things simple, we have chosen to view B as a constant (from a formal,
asymptotic point of view).

In the remainder of this text, unless otherwise specified, we shall always
use the classical O(ℓ^2) bounds for integer multiplication and division, which
have the advantage of being both simple and reasonably reliable predictors of