also compute while interacting with a single honest server we conclude that the

protocol suite is “distribution-safe.” Note that distribution-safety as a property

suggests that a single-server functionality is distributed to a set of servers in a

manner that any correctness or security property that the protocol suite may

satisfy is preserved. Distribution safety does not impose by itself any correctness

or security guarrantee on the protocol suite (and these will be argued separately).

We next de¬ne a protocol suite that relates to ElGamal encryption and we will

take advantage of in our construction. We ¬rst describe the encryption function

itself that is loosely based on [29] and then we describe the protocol suite that

we will need. Let G1 be a group of prime order q; the public key includes

G1 , G2 , H, q where G2 , H ∈ G1 . The secret key is the value w = logG1 H. Given

a plaintext M ∈ Zq the encryption is the tuple U1 , U2 , C = Gr , Gr , H r GM

1 2

where r ←R Zq accompanied with a non-interactive proof of knowledge for the

64 A. Kiayias, S. Xu, and M. Yung

statement PK(ρ : U1 = Gρ § U2 = Gρ ); this is a proof of equality of discrete-

1 2

logs that can be made non-interactive following the Fiat-Shamir heuristic [15].

Overall the ciphertext will have the form © = (U1 , U2 , C, π) with π standing for

the non-interactive proof of knowledge. Occasionally, we may use the notation

© —¦ to denote (U1 , U2 , C) and call this the “reduced ciphertext.” This would be

useful in contexts where it is certain that the ciphertext is valid (i.e., the U1 , U2

are properly formed).

We note that we do not require the actual recovery of M (thus encrypting

M

G does not hurt the e¬ciency of decryption); alternatively one can think of

the size of the plaintext space as polynomial in κ (this would be indeed the case

in our setting) and thus the recovery of M is possible through exhaustive search

(or even a baby-step giant-step strategy). Following [29] one can show that the

above cryptosystem is IND-CCA2 in the random oracle model assuming the

Decisional Di¬e Hellman assumption.

We proceed next to de¬ne a protocol suite for the encryption scheme de¬ned

above that is parameterized by two hash functions H, H . Each server maintains

a set QUAL that contains the set of properly acting servers in a sequence of an

execution. The way that the protocols in the suite maintain QUAL will guarantee

that all honest servers maintain the same set and in all cases |QUAL| ≥ t. The

protocols for the servers M1 , . . . , Mm are as follows:

“ ParGen is an m-server protocol with public output de¬ned by the proba-

bilistic function f , where f (1κ ) is a tuple that includes the description of a

group that contains G1 as well as the κ-bit prime q which is the order of G1 .

We assume that this the group is selected from a predetermined table (that

contains one entry for each κ) and thus ParGen is non-interactive.

“ ExpGen is an m-server protocol that using the group description of G1 and

the parameters t, m, it enables the i-th server to compute a public output

(H, H1 , . . . , Ht’1 ) as well as the private output wi where H = H0 = Gw is1

(t,m )

a random element of G1 and it holds that w ←’ (wi1 , . . . , wim ) where

it’1

QUAL = {i1 , . . . , im } ⊆ {1, . . . , m} as well as H0 H1 . . . Ht’1 = Gwi for

i

(t,m)

= 1, . . . , m . The notation w ←’ (w1 , . . . , wm ) means that w1 , . . . , wm is

a secret-sharing of w (cf. [28]) so that using any t out of the m shares it is

possible to reconstruct w but any less than t shares reveal no information

about w.

This protocol can be realized by the distributed key generation DKG pro-

tocol of [18] which builds on [14,27].

We note that the DKG protocol relies on Pedersen commitments which

also require a value F ∈ G1 with unknown discrete-logarithm. Such value

can be calculated by having each server computing F = H(„ ) where „ is a

¬xed string known to all parties (and is unique for each invocation of the

system); we assume that the range of H can be mapped to G1 . Note that

the protocol fails if F = 1 (a negligible probability event).

Privacy Preserving Data Mining within Anonymous Credential Systems 65

“ PkGen2. This is an m-server protocol to compute the value G2 . The calcu-

lation of G2 can be done in the same way as the value F given above but

using the hash function H instead.

* Init : the sequential composition of the protocols (ParGen, ExpGen, PkGen2)

in this order constitutes the initialization of the protocol suite. Subsequent

protocol executions employ the parameters generated by this execution.

“ ExpRecon is an m-server protocol that on input V ∈ G1 it produces the

(t,m )

public output V w where w ←’ (wi1 , . . . , wim ) is the secret-sharing of the

secret-key (committed to H = Gw ) that is held by the QUAL = {i1 , . . . , im }

subset of the m servers.

The protocol is realized as follows: The i-th server broadcasts Vi = V wi

as well as a non-interactive proof of knowledge of the statement PK(± :

jt

j

Vi = V ± § H0 H1 . . . Ht = G± ); this is a proof of equality of discrete-

logarithms that is made non-interactive using the hash function H . Upon

receiving the values Vi1 , . . . , Vim accompanied by the NIZK™s πi1 , . . . , πim ,

each server ¬nds t values Vi for which the proofs are valid, say Λ = i1 , . . . , it

»Λ

and computes l∈Λ Vil l where »Λ , . . . , »Λ are the Lagrange coe¬cients that

1 t

satisfy l∈Λ »l · p(il ) = p(0) for all polynomials p of degree less than t in

Λ

Zq .

If a server ¬nds that the proof πi is not valid, it removes server i from

the set QUAL.

“ DEC is an m-server protocol that on input a ciphertext (U1 , U2 , C, π) it returns

the decryption GM of the ciphertext or the value ⊥ to stand for failure.

Speci¬cally, given a ciphertext, the i-th server checks whether the proof π is

valid; in case the test fails the server outputs ⊥. Otherwise, the servers in

QUAL execute the ExpGen protocol on input U1 to compute U1 = H r andw

subsequently decrypt C by returning C/H r as public output.

“ MIX is an m-server protocol that on input a sequence of ciphertexts©1 , . . . ,©n

it outputs a sequence of ciphertexts ©1 , . . . , ©n such that if L1 , . . . , Ln is

the vector of plaintexts of the given ciphertexts, the output of the protocol

is a vector of ciphertexts whose plaintexts are as follows Lp(1) , . . . , Lp(n)

for some randomly selected permutation p.

The MIX protocol follows a roundtable format: according to a schedule,

—¦ —¦

each server reencrypts and shu¬„es a vector of ciphertexts ©1 , . . . , ©n ; then

it broadcasts the shu¬„ed list together with a proof of a correct shu¬„e that

ensures that all plaintext values have been retained. Note that each server

acts on the output of a previous server according to the schedule. If a server

is found to produce an incorrect shu¬„e, the protocol restarts with the mis-

behaving server removed from QUAL. There are a number of protocols that

are suitable for our setting e.g., [17,26,19]. Below we describe our MIX based

on a shu¬„ing protocol that builds on [19].

The ¬rst server Mj in the schedule will check all NIZK proofs that ac-

company the input vector of ciphertexts. It will then operate on the reduced

—¦ —¦

ciphertexts ©1 , . . . , ©n . We modify the shu¬„e protocol of [19] as follows: in

a ¬rst stage each server will broadcast a commitment to the permutation it

66 A. Kiayias, S. Xu, and M. Yung

will use as well as an NIZK that ensures the commitment is properly formed

(broadcasting this commitment is also part of the shu¬„ing protocol). Once

this stage is completed the servers will execute the shu¬„ing protocol ac-

cording to the schedule adhering to their original commitments and using

the Fiat-Shamir heuristic with the hash function H to make the proof non-

interactive. The parameters for the Pedersen type of commitment as the one

used in [19] can be produced by the m servers by executing the protocol

PkGen2 prior to the execution of the mixing protocol (using ¬xed strings

derived from the „ string each time).

“ CMP is an m-server protocol that on input two ciphertexts ©, © it returns

public output 1 if and only if DEC(©) = DEC(© ) (and 0 otherwise). We

notice that CMP as a protocol relates to a private equality test (or PET), see,

e.g., [16]. However, PET is a two-party protocol where two parties wish to

check whether their private values are equal or not; on the other hand in a

CMP protocol a set of servers operate on two ciphertexts and wish to check

whether the corresponding plaintexts are equal when nobody gets to know

the decryption of the ciphertexts.

We present two solutions to the above CMP protocol problem that, depend-

ing on the network connectivity between the servers either one can be more

suitable. The basic idea underlying them is the following observation: Let © =

U1 , U2 , C, π and © = U1 , U2 , C , π , and de¬ne Ψ —¦ = U1 /U1 , U2 /U2 , C/C .

Observe that, assuming ©, © were valid ciphertexts encrypting GM , GM re-

spectively, Ψ —¦ is a valid reduced ciphertext for the value GM’M . It follows that