tested without leaking substantial information about M, M .

Roundtable protocol for CMP. The ¬rst solution is suitable for settings

where the servers prefer to minimize broadcasting. It includes the following

stages:

1. Each server Mj selects a random value aj selected from Z— and broadcasts

q

a commitment to aj denoted by ψ = C(aj ) as well as a NIZK proof that the

commitment is well-formed.

2. Given the ciphertexts ©, © , the server Mj where j is the smallest value in

r’s

QUAL, “random scales” the ciphertext, an operation denoted by Ψ —¦ ’’

Ψ —¦ , that proceeds as follows: the server computes Ψ —¦ = (V1 , V2 , D) =

(U1 j , U2 j , C aj ) where Ψ —¦ = (U1 , U2 , C) and aj is the value committed in

a a

stage 1. The server broadcasts Ψ —¦ . Observe that after this operation is exe-

cuted by the ¬rst server the resulting ciphertext © —¦ is either an encryption of

G0 (if M1 = M2 ) or a valid ciphertext of the plaintext Gaj (M1 ’M2 ) . Moreover,

if M1 = M2 , then aj (M1 ’ M2 ) is uniformly distributed over Z— . Finally Mj

q

computes a NIZK proof PK(± : V1 = U1 § V2 = U2 § D = C ± § ψ = C(aj ))

± ±

based on the hash function H . The next server in QUAL collects (V1 , V2 , D),

veri¬es the proof and repeats the process. If a server produces an invalid proof

it is removed from QUAL and the protocol is restarted.

Privacy Preserving Data Mining within Anonymous Credential Systems 67

3. After at least t of the participating servers execute step 2 (this will be guar-

ranteed by the assumption that m ≥ 2t ’ 1 and the fact that the adversary

controls at most t ’ 1 servers), the servers enter the third stage of the proto-

col: if Ψ —¦ is the ¬nal result from stage 2, the servers execute the protocol DEC

on Ψ —¦ (omitting the part where the proof is being checked). The servers con-

clude by returning “1” if the decryption of Ψ —¦ results in 1 (i.e., G0 mod P )

and “0” otherwise.

Threshold protocol for CMP. The second protocol solution to CMP is more

suitable for cases where there is a great number of servers and broadcasting is

an inexpensive operation (in this setting the roundtable approach of the ¬rst

solution may be ine¬cient). It will be broken into the following stages:

1. The servers execute ExpGen on group G1 to produce Z0 , Z1 , . . . , Zt’1 ∈

(t,m)

G1 ; recall that this results in Z = Z0 = Gz such that z ←’ (zi1 , . . . , zim )

for some subset QUAL = {i1 , . . . , im } of QUAL. The internal state of each

server contains now the share zi .

2. The servers in QUAL as determined from the previous stage, execute three

instances of the protocol ExpRecon on input U1 , U2 , C respectively, where

Ψ —¦ = (U1 , U2 , C). This results in the scaled ciphertext Ψ —¦ = (V1 , V2 , D) =

(U1 , U2 , C z ).

z z

3. The servers execute the protocol DEC on Ψ —¦ (omitting the part where the

proof is being checked). The servers conclude by returning “1” if the decryp-

tion of Ψ —¦ results in 1 (i.e., G0 mod P ) and “0” otherwise.

We conclude the section by showing that the protocol suite we de¬ned above

is t-distribution-safe:

Theorem 1. The suite of m-server protocols ParGen, ExpGen, PkGen2, DEC, MIX,

CMP described above is t-distribution-safe assuming the discrete-logarithm as-

sumption and that H is a random oracle controlled by the simulator, for m ≥

2t ’ 1 servers.

3 Data Mining Group Signatures (DMGS): Model

We now de¬ne “data mining group signatures” (DMGS), that extend the notion

of group signatures with a MINING code snippet: a distributed algorithm that can

be executed by a quorum of mining servers and will be based on the ciphertext

manipulations and operations of the previous section; the code computes a given

usage characteristic (the data mining objective of the system). The participants

involved in the system are the users/signers, the DMGS manager (i.e., credential

issuer) that is denoted by DMGM, and the data mining servers M1 , . . . , Mm .

The adversary we will assume will be t-threshold meaning that it can corrupt

at most t ’ 1 mining servers. While the corruptions of the mining servers will

be assumed to be static in our security modeling the adversary can adaptively

corrupt the group members.

68 A. Kiayias, S. Xu, and M. Yung

De¬nition 2. (DMGS) A data-mining group-signature scheme is comprised of:

1. Setup: This stage consists of two parts.

(a) KeyGDMGM : On input a security parameter κ and an upper bound on

the number of users n, this probabilistic algorithm outputs the data min-

ing group signature manager™s public key YDMGM (including all system

parameters), the public user database, and the secret keys of all users

sk1 , . . . , skn . The secret keys sk1 , . . . , skn are distributed privately to the

users and the DMGM terminates by discarding all its random coin tosses.

To each user key ski there is a corresponding public-key pki and a name

idi that are part of the public user database {idi , pki }i (we may refer to

this table as: public user database). The table can be accessed through

a function tableLook(·), that given pki returns idi , the identity of the i-

th user. Without loss of generality we will assume that idi = i but in

practice idi may contain more information about the user.

(b) KeyGDM : this is an m-server protocol that with public input the parame-

ters 1κ , t, m, it enables the data mining servers M1 , . . . , Mm to produce

as public output the public key YDM that will be attached to the public key

of the system, which is denoted by Y = YDMGM ||YDM . At the completion

of the protocol each data mining server will also return the private output

Sj which will be a share of the virtual key skDM .

2. Sign: A probabilistic algorithm that given the system public key Y, a user™s

secret key ski , and a message M , it outputs a signature for the message M .

We write SIGN(Y, ski , M ) to denote the application of the signing algorithm.

A signature δ produced by the SIGN algorithm contains a mining tag denoted

by mtδ .

3. Verify: An algorithm for establishing the validity of a signature on a mes-

sage with respect to a system public key Y. Notice that VERIFY(Y, M, δ) ∈

{true, false}.

4. OPEN is an m-server protocol that given a signature δ, it enables the mining

servers S1 , . . . , Sm recover a value pki that can be used to identify a user

from the public user database, or the value ⊥. When it will be clear from

the context what servers are participating in the execution we will denote the

output of the protocol simply by OPEN(δ).

5. MINING: This is an m-server protocol that enables the mining servers to col-

laboratively compute some application-dependent usage characteristic func-

tion usageChar : N— ’ T where T is some arbitrary range. The protocol

MINING will be expressed as an algorithm that is executed by each mining

server locally and includes calls to the subprotocols MIX, tableLook(DEC),

and CMP that operate on the mining tags of a given vector of valid signa-

tures. Two concrete implementations will be presented in Section 4. When it

will be clear from the context what servers are participating in the execution

we will denote the output of the protocol by simply MINING(δ1 , . . . , δK ).

A ppt adversary A for a DMGS has access to the following oracles: Setup via

KeyGDMGM and KeyGDM ; OSign which receives a user™s identity i and a message

Privacy Preserving Data Mining within Anonymous Credential Systems 69

M , returns Sign(Y, ski , M ), and sets hist(Sign) = hist(Sign)||(i, M ); MINING and

tableLook; Corrupt which receives a group signature user™s identity i and returns

ski and sets Corr = Corr ∪ {i}.

The formalization of the security properties will be performed in the setting

of a single honest mining server. Subsequently, arguing the security of our multi-

server construction will be split in two steps: ¬rst we will prove that it satis¬es

the properties stated below in the single server setting; then we will show that it

satis¬es distribution-safety as de¬ned in the previous section, hence the advan-

tage cannot gain signi¬cant advantage from corrupting a minority of servers.

De¬nition 3. (security of DMGS) The security properties of data mining group

signatures are:

1. Correctness: We require that a scheme possesses “signature correctness” that

suggests the following probability is overwhelming:

⎡ ¤

YDMGM , public user database, sk1 , . . . , skn ← KeyGDMGM (1κ , n);

Pr ⎣ YDM , S ← KeyGDM (YDMGM , t = 1, m = 1); Y := YDMGM ||YDM ; ¦

δ ← Sign(Y, ski , M ) : true = Verify(Y, M, δ) § i = tableLook(OPEN(δ)

and similarly for “mining correctness” the following probability is overwhelm-

ing:

⎡ ¤