For the remainder of this section, we set p = 2 and let v be an element of

GF(22m ) such that v + v = 1 with = 2m . Then {v, v } is a normal basis

of GF(22m ) and the coordinate vectors consist of two elements of GF(2m ). We

denote by g the trace of v over GF(2m ):

g = v 2 + v, (1)

and g ∈ GF(2m ). We show now that the use of a normal basis leads to simple

formulas for products and inverses of elements.

2.2 Multiplication

Let (a, b) and (c, d) be the coordinates of two elements of GF(22m ). The coordi-

nates of the product are given by the following formula:

(e, f ) = (a, b) — (c, d) ” ev + f v = acv 2 + (ad + bc)v +1

+ bdv 2

e = (a + b)(c + d)g + ac

”

f = (a + b)(c + d)g + bd

Here the element g is de¬ned by (1).

Using Normal Bases for Compact Hardware Implementations 239

2.3 Inversion

Let (a, b) be the coordinates of an element of GF(22m ). The coordinates of the

inverse element are given by the following formula:

(c, d) = (a, b)’1 ” 1 = (av + bv )(cv + dv )

” 1 = acv 2 + (ad + bc)(v 2 + v) + bd(v 2 + 1)

c = ((a + b)2 g + ab)’1 b

” (2)

d = ((a + b)2 g + ab)’1 a

Again the element g is de¬ned by (1).

3 Implementing the AES Inverse Using Normal Bases

Canright has investigated all 432 possible tower ¬eld decompositions using poly-

nomial and normal bases in his work [3,2]. He has obtained the smallest hardware

implementation of the S-box by using normal bases at all three levels. However,

he did not consider the decomposition of the ¬eld GF(16) into the sub¬eld GF(2)

using normal bases. In this section we derive this decomposition and present for-

mulas for the inversion in GF(16) using the additional normal bases.

3.1 Inversion in GF(256)

Assume a normal basis {v, v 16 }. Equation (2) suggests the following 3-stage

approach to implement the inverse [2,3].

Step 1: Compute the product ab and add the result to (a + b)2 g. Computing

the product is a nonlinear operation and the remaining operations are linear.

Step 2: Compute the multiplicative inverse over GF(16) of the result of Step 1.

Step 3: Multiply the result of Step 2 once with a, and once with b.

Note that Step 3 uses twice the same hardware. Hence, hardware can be saved by

splitting Step 3 up into two steps using the same circuit. The computation of the

multiplicative inverse over GF(16) can be computed by applying recursion, i.e.

describing GF(16) as an extension ¬eld of GF(4) using the tower ¬eld approach.

We have opted for a direct computation of the inverse over GF(16), as explained

in the next section.

3.2 Normal Bases in GF(16)

The ¬eld GF(16) can be described directly as an extension ¬eld of GF(2). This

approach gives a 4-dimensional basis with coordinate elements from GF(2), i.e.

m = 1 and p = 4. The ¬eld GF(16) counts exactly 8 solutions to the following

equation:

θ + θ2 + θ4 + θ8 = 1. (3)

240 S. Nikova, V. Rijmen, and M. Schl¨¬er

a

These 8 elements de¬ne 8 possible normal bases: {θ, θ2 , θ4 , θ8 }. Note that if θ1

2 4 8

satis¬es (3), then so do θ1 , θ1 and θ1 , i.e. they de¬ne rotated versions of the

same 4 base vectors. Thus the number of the normal bases reduces to two, as

noted in [7].

It is called an optimal normal base (ONB) [7], because the complexity of

the multiplication formula in this basis is minimal, i.e. equal to 2n ’ 1 = 7

[4]. In the non-optimal normal basis (NB), the complexity of the multiplication

formula equals 9 [8]. In those two cases multiplying x = (x0 , x1 , x2 , x3 ) with

y = (y0 , y1 , y2 , y3 ) results in z = (z0 , z1 , z2 , z3 ), where

NB: z3 = x2 y3 + x3 y2 + x1 y3 + x3 y1 + x3 y0 + x0 y3 + x2 y2 + x0 y1 + x1 y0 .

ONB: z3 = x3 y1 + x0 y1 + x0 y2 + x1 y3 + x1 y0 + x2 y0 + x2 y2 .

(4)

As noted by Paar [8] the multiplication in any normal basis is rotation symmetric,

so the rest of the output bits z0 , z1 and z2 can be computed by rotating the input

bits. Note that the multiplicative group of GF (16) has order 15 and we know

from group theory that this group is the direct product of two cyclic groups of

order 3 and order 5. Remember that the intersection of these two subgroups is

trivial.

3.3 Inversion in GF(16)

We now study in detail how the choice of θ in¬‚uences the complexity of the map

s(x).

1. Equation (3) can be rewritten as

(θ + θ4 )2 + (θ + θ4 ) = 1.

It follows that for all θ satisfying (3), the elements θ + θ4 and θ2 + θ8 are the

roots of the polynomial x2 + x + 1. The roots of this polynomial are exactly

the two elements of order 3 in GF(16).

2. In a normal base, the zero element of GF(16) always has coordinates 0000,

while the unit element always has 1111. Hence we always have s(0000) =

(0000) and s(1111) = (1111).

3. It follows from Property 2 that x and s(x) must have coordinates with the

same rotational symmetry. Hence, the inverse map must map elements with

rotational symmetry to elements with the same symmetry. Hence we have

the following commutative diagram:

This implies that s({0101, 1010}) = {0101, 1010}. The inversion map has

only two ¬xed points: the zero element and the unit element and it follows

Using Normal Bases for Compact Hardware Implementations 241

that s(0101) = 1010 and s(1010) = 0101: We can conclude that 0101 and

1010 are the two elements of order 2. Together with the unit element, they

form the cyclic subgroup of order 3 of the multiplicative group of GF (16).

4. The remaining 12 elements of GF(16) can be partitioned into 3 sets with 4

elements each. The index of Si denotes the number of ones in each represen-

tation:

S1 = {0001, 0010, 0100, 1000}

S2 = {0011, 0110, 1100, 1001}

S3 = {0111, 1110, 1101, 1011}

Due to the rotational symmetry of s, this partitioning is consistent with s.

If one element of Si is mapped to an element of Sj , then all elements of Si

are mapped to elements of Sj , and since s is an involution, this also implies

that all elements of Sj are mapped to elements of Si .

5. Assume now that s(Si ) = Si . Then the rotational symmetry of s implies:

∃r, ∀v ∈ Si : s(v) = rotr (v).

Since s is an involution we have

rotr (rotr (v)) = rot2r (v) ≡ v.

This implies that 2r = 0 (mod 4). Since we cannot have new ¬xed points

for s, r = 0 we get r = 2. We require that s(S2 ) = S2 because otherwise, we

would get s(0011) = (1100), and the corresponding element satis¬es x’1 +

x = 1. Since this would mean that it is a root of the polynomial x2 + x + 1