that some memory is available for caching at least one group element with an

integer. Such data will be put into cache memory when performing a multi-

e

exponentiation 1¤i¤k gi i involving a “new” g1 (i.e., one for which no cache

entry currently exists), and can be used to speed up any subsequent multi-

exponentiation repeating the same g1 as long as the cache entry is kept. While

the method is easier to describe assuming that dedicated cache memory is avail-

able, Appendix C will show that the technique can be quite useful even if this

is not the case and a portion of fast read/write memory has to be sacri¬ced

instead: In a speci¬c example scenario where read/write memory is very scarce

(which is typical of smart cards and other embedded devices), our technique

already leads to a 10 % average speed advantage. The technique is also useful

for devices without such constraints; the speci¬c speed advantage factor gained

by using our method strongly depends on the concrete application scenario and

can be signi¬cantly higher than the 10 % in said example.

Like many approaches for exponentiation using precomputation (such as the

Lim/Lee approach [12]), our technique has roots that can be traced back to

Pippenger [21,22]; see also [4]. The novelty here in this regard is that for cached

2n

precomputation, we do not store powers of the form g1 as by [21], which would

42 B. M¨ller and A. Rupp

o

e

impose some computational overhead while computing 1¤i¤k gi i when g1 is

new. Instead, we store other powers of g1 that happen to come up without e¬ort

if we arrange the computation suitably.

Section 2 describes preliminaries for our technique: interleaved multi-

exponentiation, and radix-2 exponent splitting. Then, Section 3 presents the

novel multi-exponentiation technique, which relies on caching certain interme-

diate results that can be obtained by investing very little additional read/write

memory or time, allowing us to speed up later multi-exponentiations if an

appropriate cache entry is available. Section 4 gives example performance

¬gures for the new technique in certain application scenarios. Appendix A

provides a comprehensive example to illustrate the technique. Appendix B dis-

cusses some implementation aspects. Finally, Appendix C considers a particu-

lar scenario to demonstrate the performance gain achievable by using the new

technique.

2 Multi-exponentiation

We show known techniques that we later will use and combine in a novel way.

Section 2.1 describes interleaved multi-exponentiation, an approach for com-

puting power products. It also brie¬‚y describes some properties of radix-2 ex-

ponent representations that can be used in interleaved multi-exponentiation.

Section 2.2 describes the technique of radix-2 exponent splitting, which can be

used to obtain shorter exponents by converting exponentiation tasks into multi-

exponentiation tasks, or converting k-fold multi-exponentiation tasks into k -fold

multi-exponentiation tasks with k > k. Radix-2 exponent splitting is a useful

technique for ¬xed bases (namely, for exponentiation or multi-exponentiation

with precomputation that can be done in advance).

2.1 Interleaved Multi-exponentiation

We build on the straightforward multi-exponentiation strategy that has been

called interleaving in [14], which generalizes well-known methods for single expo-

nentiations such as the (left-to-right) binary or sliding window methods. Assume

that radix-2 representations ei = 0¤j¤ bi,j · 2j , bi,j ∈ Bi , of all exponents are

given where each Bi is a set of integers. We write ei = (bi, , bi, ’1 , . . ., bi,1 , bi,0 )2

and call the bi,j digits and Bi a digit set. We require that every gi for b ∈ Bi \{0}

b

be available from precomputed data. Note that in a group where inversion is

particularly easy (such as those used in elliptic curve cryptography where an

inversion requires just obtaining an additive inverse in the underlying ¬eld or

’b b

performing a ¬eld addition), obtaining gi from precomputed data is easy if gi

has been stored; so both b and ’b can be included in set Bi if the single element

b

gi has been stored in a precomputed table of powers. In this setting, interleaved

multi-exponentiation computes the power product as follows.

Faster Multi-exponentiation through Caching 43

A ← 1G {Start with identity element}

for j = down to 0 do

A ← A2

for i = 1 to k do

if bi,j = 0 then

b

A ← A · gi i,j {Multiply by [inverse of] precomputed element}

return A

This is a left-to-right technique in that it proceeds from the most signi¬cant

digits (“left”) down to the least signi¬cant digits (“right”).

Typical digits sets are of the form B ± (m) = {±1, ±3, ±5, ±7, . . ., ±m, 0} for

groups where inversion is easy, or B(m) = {1, 3, 5, 7, . . ., m, 0} for semigroups

in general. Here parameter m is an odd integer, often but not necessarily of

the form (11. . .11)2 , i.e. m = 2w ’ 1, w ≥ 1 an integer. This special form

applies to the sliding window technique (cf. [9]) and to various variants of it that

employ signed-digit representations of exponents, such as those introduced in [13]

using a right-to-left conversion from binary to signed-digit representation and in

[17,3,20] using left-to-right conversions. The general case with an arbitrary odd

m was introduced as fractional window representations in [15], with left-to-right

conversions for the signed-digit case suggested in [11,23,16]. Di¬erent digits sets

can be used for di¬erent exponents, so we have Bi = B(mi ) or Bi = B ± (mi )

with per-exponent parameters mi when employing such representations.

The length of a representation is the number of digits that remain if we drop

any leading zeros (so the length of (b , b ’1 , . . ., b1 , b0 )2 is + 1 if b = 0). Max-

imum length l + 1 is su¬cient to represent any l-bit integer e (2l’1 ¤ e < 2l )

in any of the representations mentioned above [18] (length l is su¬cient for any

of the unsigned-digit representations), and the minimum length with these rep-

resentations is l + 1 ’ log2 m . Thus, the maximum outer loop index in the

algorithm as shown above is su¬cient for integers up to bits.

The weight of a representation is the number of digits that are non-zero. The

conversion techniques mentioned above are known to achieve, for any integer e,

the minimum weight possible given the respective digit set [18,16]. For unsigned

and signed fractional window representations using digit set {1, 3, 5, . . ., m, 0} or

{±1, ±3, ±5, . . ., ±m, 0}, the average weight for random integers up to bits is

slightly above

and ,

m+1 m+1

w(m) + w(m) 1 + w(m) + w(m)

2 2

respectively, where w(m) = log2 (m + 1) ; for the average density (weight di-

vided by ), we have convergence to the resulting estimates as goes to ∞

(see [16]). For the special case m = 2w ’ 1 (i.e., the sliding window technique

and its non-fractional signed-digit counterparts), such that w(m) = w, the above

is simply /(1 + w) and /(2 + w), respectively.

Observe that in the interleaved multi-exponentiation algorithm as shown

above, (semi-)group operations need not actually be performed until after the

44 B. M¨ller and A. Rupp

o

¬rst multiplication of A by a precomputed element or its inverse, since A = 1G

holds up to this point. This means that the initial squarings of 1G can be skipped,

b b

and the ¬rst operation A ← A · gi i,j amounts to an assignment A ← gi i,j .

To estimate the time required to perform an interleaved multi-exponentiation,

we thus need the maximum length of the representations of the ei to determine

the number of squarings, and the weight of the representation of each ei to deter-

mine the number of other multiplications by elements available from precomputed

data. (The maximum length is one more than the number of squarings, and the

sum of the weights is one more than the number of other multiplications.) This is

not counting any group inversions, since we would only use these in the algorithm

if inversion is easy. In addition to this, we have to look at the time needed for pre-

b

computation. If gi is a ¬xed base, we can assume that the required powers gi have

been precomputed in advance (and possibly built into ROM) and thus do not enter

the time estimate. However, if gi is not ¬xed, some e¬ort goes into precomputing

m

3

these powers: from gi , the powers gi , . . ., gi i can be computed using one squaring

(to obtain gi as an intermediate value) and mi2’1 other multiplications.

2

(Note that the minimum-weight property of a conversion technique does not

mean that it always provides the best radix-2 representation possible given the

particular digit set. As discussed in [15, Section 5.1] and [16, Section 4], modi¬ed

signed fractional window representations can sometimes reduce length without