Information set decoding attacks. Information set decoding attacks could be attempted through two different strategies. The first one consists in Lee and Brickell's method [3], that, however, would require 2^94 operations.

Alternatively, the vector of intentional errors e could be searched as the lowest weight codeword in the extended code generated by the matrix obtained by appending the ciphertext x to G as an additional row:

    [ G ]
    [ x ] .

With the new choice of the parameters, however, this search would be very involved. By using the Stern algorithm, for example, it would require more than 2^82 operations. For the first variant, instead, a similar search would require 2^71 operations.

At the current stage of cryptanalysis, these attacks achieve the minimum work factor, which can hence be considered the security level of each cryptosystem variant.

Attacks to the dual code. When the sparse structure of the private H is not sufficiently hidden, the eavesdropper could recover it by searching for low weight codewords in the dual of the public code, which admits G as parity-check matrix. In this version of the cryptosystem, however, the dual of the public code does not contain low weight codewords, due to the effect of the matrix Q on the rows of the private matrix H.

In the present case, the matrix Q has row/column weight 11, so the dual of the public code has codewords with weight ≤ n0 · dv · m = 429. Since the rows of H are sparse, it is highly probable that the minimum weight codewords in the dual of the public code have weight close to n0 · dv · m. However, even a lower weight would suffice to avoid attacks to the dual code. For example, if we suppose the existence of p = 8192 codewords with weight 150 in the dual of the public code (that has length n = 24576 and dimension p = 8192), searching for one of them through the Stern algorithm would require more than 2^92 operations.

4 Fast Computations with Circulant Matrices

By exploiting the isomorphism described in Section 2 between the algebra of p × p binary circulant matrices and the ring of polynomials R = GF(2)[x]/(x^p + 1), computing the determinants of S and Q to check their invertibility, and computing the k0 × n0 public matrix G' = S^(−1) · G · Q^(−1), require only a few tens of operations in the ring R. However, the key generation process is not the critical one for efficiency, so we will focus the complexity analysis on the vector-matrix products used for encryption and decryption.

254 M. Baldi, M. Bodrato, and F. Chiaraluce

4.1 Vector-Matrix Product

The isomorphism extends also to the vector-matrix product. Let us suppose to have a vector u = (u_0, u_1, ..., u_{p−1}), and a circulant p × p matrix A mapped to the polynomial a(x) ∈ R by the isomorphism. The product w = u · A = (w_0, w_1, ..., w_{p−1}) will then be the vector whose components satisfy the equation

    Σ_{i=0}^{p−1} w_i x^i ≡ ( Σ_{i=0}^{p−1} u_i x^i ) · a(x) mod (x^p + 1).

This vector-matrix product computation can be accelerated basically with two possible strategies: using fast polynomial multiplication algorithms based on evaluation-interpolation strategies, or using an optimized vector-matrix product exploiting the Toeplitz structure.
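The reduction modulo x^p + 1 amounts to wrapping the high-order half of the product back onto the low-order half. A minimal sketch (Python integers as GF(2)[x] bit-vectors, a common representation; the function names are ours):

```python
def clmul(a, b):
    """Carry-less product in GF(2)[x]: bit i of an int is the coefficient of x^i."""
    r = 0
    while b:
        if b & 1:
            r ^= a          # additions in GF(2) are XORs
        a <<= 1
        b >>= 1
    return r

def circulant_vecmat(u, a, p):
    """w = u * A for a p x p binary circulant matrix A mapped to a(x):
    multiply the polynomials, then reduce mod x^p + 1 (x^p = 1, so the
    high-order bits wrap around onto the low-order ones)."""
    w = clmul(u, a)
    return (w >> p) ^ (w & ((1 << p) - 1))
```

For instance, with p = 3, u(x) = x^2 and a(x) = x^2, the product x^4 reduces to x modulo x^3 + 1.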

To compare the methods, we will count the total number of bit operations needed. For the naïve implementation, we will consider a number of operations given by the number of non-zero entries in the matrix. For the dense scenario we will consider that half of the entries are non-null. The starting value is hence p^2/2 operations. The two phases we will focus on are:

– The product u · G' used for encryption, where u is the message, seen as a vector of k0 elements in R, and G' is the public k0 × n0 matrix with entries in R. The cost with the naïve estimate is p^2 · k0 · n0 / 2 operations.
– The last step of decryption, that is, the product u' · S = u, where u' is again a k0 vector in R, and S is a k0 × k0 invertible matrix. The naïve cost for this step is p^2 · k0^2 / 2.
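Plugging in the parameters considered in this paper (p = 8192 and, consistently with n = 24576 = n0 · p and dual dimension p, n0 = 3 and k0 = 2), the naïve estimates work out as follows (a quick sanity check of the formulas above, with the parameter values as our assumption):

```python
p, k0, n0 = 8192, 2, 3          # circulant size and block dimensions (n = n0*p)

# Naive dense product: half of the p*p entries of each circulant block are
# assumed non-zero, so each block contributes p**2 / 2 bit operations.
encryption_cost = p**2 * k0 * n0 // 2    # u * G': k0 x n0 blocks of size p x p
decryption_cost = p**2 * k0**2 // 2      # u' * S: k0 x k0 blocks of size p x p

print(encryption_cost)   # 201326592, about 2^27.6
print(decryption_cost)   # 134217728, exactly 2^27
```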

5 Fast Polynomial Product

All the algorithms for fast polynomial multiplication are based on the same scheme: evaluation, point-wise multiplication, interpolation. The first strategy of this kind was proposed by Karatsuba [17] and then generalized by Toom and Cook [18,19]. We will call both of them Toom-s, with s the splitting order [20]. Other asymptotically faster algorithms exist for GF(2); the most interesting ones are due to Cantor and Schönhage [21,22]. Another approach is the use of segmentation, also known as Kronecker-Schönhage's trick, but the threshold between Toom-Cook and all these methods is far above 100 000 bits [23].

5.1 General Toom-Cook Approach

The Toom-s algorithm for polynomial product requires five steps:

– Splitting: The two operands are represented by two polynomials (f and g) with s coefficients.
– Evaluation: f and g are evaluated in 2s − 1 points.
– Point-wise multiplication: Computed evaluations are multiplied to obtain evaluations of the product, for example (f · g)(0) = f(0) · g(0).
– Interpolation: Once the values of the product f · g in 2s − 1 points are known, the coefficients are obtained via interpolation.
– Recomposition: The coefficients of the result are combined to obtain the product of the original operands.

A New Analysis of the McEliece Cryptosystem Based on QC-LDPC Codes 255

Starting from [24], where the Toom-2,3,4 algorithms in GF(2) were described in full detail, we can estimate the cost of each of the steps. Each product of two polynomials would require the cost for evaluation C_e counted twice, the cost for point-wise multiplication C_m, plus the cost for interpolation and recomposition C_i. Since we are dealing with vector-matrix products, where the matrix is fixed a priori, we can assume that all evaluations for the fixed operand are pre-computed.

Moreover, when multiplying a k0 vector by a k0 × n0 matrix, we can reduce the count to the strictly needed operations. We assume a pre-computation of all the evaluations of the matrix entries, and we need to evaluate the vector components only once: so we will count only k0 evaluations. After the n0 · k0 point-wise multiplications, we combine the evaluated products f_i g_i(α), f_j g_j(α), ..., to obtain evaluations of the results like

    (f_i g_i + f_j g_j + · · ·)(α),

then we interpolate only the n0 result polynomials: so we count only n0 interpolations and recompositions.

5.2 Toom-2, also Known as Karatsuba

The Toom-2 algorithm splits the operands in two parts. Starting from two p-bit long polynomials we operate on ⌈p/2⌉-bit long parts.

Evaluation requires an addition of two parts. The 3 point-wise multiplications operate on parts, and give doubled parts as results. Interpolation requires 5/2 additions on such doubled parts. Since an addition of two polynomials in GF(2)[x] with degree d requires d operations, we can conclude that the use of Karatsuba to multiply a degree-p polynomial by a fixed one requires:

– ⌈p/2⌉ operations for the evaluation,
– 3 multiplications of polynomials with degree ⌈p/2⌉,
– 5⌈p/2⌉ operations for the interpolation.
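The Toom-2 phases, and the saving obtained by evaluating the vector components only once and interpolating only the result columns, can be sketched as follows (a toy Python illustration with integers as GF(2)[x] bit-vectors; the function names are ours, and the naïve clmul stands in for the recursive point-wise products):

```python
def clmul(a, b):
    """Naive carry-less GF(2)[x] product (stand-in for recursive multiplication)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def kar_eval(f, h):
    """Toom-2 evaluation: split f = f0 + f1*x^h and form the three Karatsuba
    'values' f0, f0 + f1, f1 (one addition of two parts)."""
    f0, f1 = f & ((1 << h) - 1), f >> h
    return (f0, f0 ^ f1, f1)

def kar_interp(v0, v1, vinf, h):
    """Toom-2 interpolation and recomposition from the three point-wise products."""
    mid = v0 ^ v1 ^ vinf                  # middle coefficient block
    return v0 ^ (mid << h) ^ (vinf << 2 * h)

def vecmat(u, G, h):
    """u (k0 entries) times G (k0 x n0 nested lists): k0 evaluations,
    k0*n0 point-wise multiplications, but only n0 interpolations."""
    ue = [kar_eval(ui, h) for ui in u]                   # evaluate vector once
    Ge = [[kar_eval(g, h) for g in row] for row in G]    # pre-computable
    out = []
    for j in range(len(G[0])):
        acc = [0, 0, 0]
        for i, uev in enumerate(ue):
            for t in range(3):            # accumulate in the evaluated domain
                acc[t] ^= clmul(uev[t], Ge[i][j][t])
        out.append(kar_interp(acc[0], acc[1], acc[2], h))  # one per column
    return out
```

Because evaluation and interpolation are linear, accumulating in the evaluated domain gives the same result as summing the individually interpolated products: vecmat([u0, u1], [[g0], [g1]], h) equals clmul(u0, g0) ^ clmul(u1, g1) for any split point h ≥ 1.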

5.3 Cost of Exact Divisions

All the Toom-s with splitting order s > 2 require some exact divisions. We need

to evaluate the cost of this kind of operation.

First of all, we highlight the fact that all the divisions needed for Toom-3 and Toom-4 in GF(2)[x] are divisions by binomials of the form x^w + 1, with w ∈ {1, 2, 3}. Moreover, all the divisions in the Toom-Cook algorithm are exact divisions [24] and, therefore, can be computed in linear time [25]. When the divisor is in the very special form x^w + 1, the exact division can be obtained in-place with the following simple code.

Input: the degree d, the vector (a_0, ..., a_d) of coefficients of the polynomial A = Σ_{i=0}^{d} a_i x^i, the divisor in the form D = x^w + 1.
Output: the overwritten vector (a_0, ..., a_{d−w}) of coefficients of the new polynomial A/D = Σ_{i=0}^{d−w} a_i x^i.
Execution: for i = 0 ... (d − w), a_{i+w} ← a_{i+w} + a_i.
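A direct transcription of this procedure (in Python, with the coefficient vector as a list of bits; the function name is ours):

```python
def exact_div_xw_plus_1(a, w):
    """In-place exact division over GF(2): a[i] is the coefficient of x^i in
    A(x), assumed to be an exact multiple of D(x) = x^w + 1. After the loop,
    a[0..d-w] hold the quotient A/D and the top w entries have become zero."""
    d = len(a) - 1
    for i in range(d - w + 1):     # i = 0 .. d - w, one XOR each: linear time
        a[i + w] ^= a[i]
    return a[:d - w + 1]
```

For example, dividing x^3 + 1 = (x + 1)(x^2 + x + 1) by x + 1 recovers [1, 1, 1], i.e. the quotient x^2 + x + 1.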


So, a little less than d operations are required for any exact division needed in Toom-3 and Toom-4 on GF(2). Since the division process cannot be parallelized as the addition can, we add an extra weight and double the cost: we consider 2d bit operations for each exact division. The given algorithm overwrites its input, so it should be slightly modified for general use; however, all the algorithms presented in [24], and proposed here, work with in-place operations.

Other operations used in Toom-Cook are bit-shifts, but they can be avoided