ρ ’ρ

R3 = Gρr , R4 = Gρr , R5 = H0 r Gρx , R6 = T1 x g1 xz (u)’ρxz ,

ρ

1 2

Then we employ the hash function H to compute

c ← H (M ||Y||M ||T1 ||T2 ||T3 ||T4 ||T5 ||R1 ||R2 ||R3 ||R4 ||R5 ||R6 )

Subsequently the following values are computed: sx = ρx + cx in Zp , sz =

ρz + cz in Zp , sz = ρz + cz in Zp , sxz = ρxz + cxz in Zp , sxz = ρxz + cxz

in Zp , sr = ρr + cr in Zq . The output of the signing algorithm is the tuple:

δ = T1 , T2 , T3 , T4 , T5 , c, sx , sz , sz , sxz , sxz , sr .

?

Verify: Signature veri¬cation is achieved by the following test: c = H Y||M ||T1 ||

s ’s

’c ˜ ’c ’c ’c

T2 ||T3 ||T4 ||T5 ||g1z usz T1 ||E||Gsr T3 ||Gsr T4 ||T5 Gsx H0 r ||T1 x g1 xz u’sxz

s s

,

1 2

where E = e(T2 , g2 )sx · e(g1 , g2 )’sxz · e(g1 , w)’sz · (e(g1 , g2 )/e(T2 , w))’c .

˜

OPEN: this m-server protocol is a modi¬cation of the DEC protocol of section 2.

The only essential di¬erence is that the test for signature validity based on

the public-key Y substitutes the veri¬cation of the proof that accompanies the

ciphertext there. Observe that the tuple mt = (T3 , T4 , T5 ) can be parsed out of

δ and corresponds to a reduced ciphertext in the terminology of section 2. The

mining servers execute the DEC protocol as described there to produce the value

Gxi = DEC(mt) that can be used to identify the signer in conjunction with the

tableLooK function. Recall that the user database contains entries of the form

{(idi , Gxi )}i and the table can be queried by the function tableLook(·) that on

input Gxi will return idi .

MINING: The MINING protocol will be an m-server protocol that is given as input

a vector of signatures (δ1 , . . . , δK ) (with the precondition that they are valid) out

of which their mining tags (mtδ1 , . . . , mtδK ) can be parsed; these correspond to

reduced ciphertexts of the underlying encryption scheme. We present two di¬er-

ent MINING protocols that employ the MIX, DEC, CMP as sub-protocols operating

over the mining tags that are parsed from the given vector of valid signatures.

(I) Usage histogram: The usage histogram functionality asks for a histogram

(idi , counti ) for i = 1, . . . , n where counti is the number of signatures signer idi

contributed.

Code snippet for MINING: usage histogram

Parse δ1 , . . . , δK to obtain mtδ1 , . . . , mtδK ;

mt1 . . . mtK ← MIX(mtδ1 , . . . , mtδK );

for i = 1 to K do idi ← tableLook(DEC(mti ));

SORT(id1 , . . . , idK );

Privacy Preserving Data Mining within Anonymous Credential Systems 73

The complexity of the usage-histogram functionality is equal to O(K log K +

Kdec + mix(K)), where dec is the cost of DEC and mix is the cost of MIX (note

that SORT is implemented locally by each server on its local output).

(II) Blinded usage histogram with outlier detection : Given the sequence of signa-

tures, we want to create a histogram that contains entries of the form (i, counti )

where i corresponds to one of the signers that produced some of the signatures

among δ1 , . . . , δK and counti corresponds to the number of signatures that were

contributed by this user; note that here i does not identify the user (it is not

equal to the user™s identity and not correlated with it ” it is simply a histogram-

speci¬c pseudonym for the user and is used only for the presentation of the his-

togram). Besides this, detecting outliers requires the mining servers to discover

the identity of signers that either over-use (e.g. spammers) or under-use (e.g.

those that haven™t seen enough advertisements) the system. To facilitate this

the algorithm takes two parameters lo, hi ∈ {1, . . . , K} and requires the recovery

of the identity of any signer whose usage is below (resp. above) lo (resp. hi):

Code snippet for MINING blinded usage histogram with outlier detection

Parse δ1 , . . . , δK to obtain mtδ1 , . . . , mtδK ;

mt1 . . . mtK ← MIX(mtδ1 , . . . , mtδK );

for i = 1 to K do

k = 0;

if mti = NIL then

k = k + 1;

count[k] = 1;

for j = i + 1 to K do

if mtj = NIL then

test ← CMP(mti , mtj );

if test == 1 then

count[k] = count[k] + 1;

mtj = NIL;

output k, count[k] ;

if (count[k] < lo) or (count[k] > hi)

then output id ← tableLook(DEC(mti ));

The complexity of the blinded usage histogram generation is O(K 2 cmp+mix(K)),

where cmp is the cost of CMP.

Correctness and Security Arguments. Below we argue how our construction spec-

i¬ed above satis¬es the model that we put forth in sections 2 and 3.

Our security arguments will be split into steps. First we will show that our

suite of multi-server protocols is t-distribution safe for m ≥ 2t ’ 1 servers. This

will enable us to reduce any adversary that takes advantage of the multi-server

nature of the system and corrupt a set of t ’ 1 servers to an adversary that

performs the same attack against a single honest server. Then, we will show

that our scheme in the single server setting satis¬es the properties we put forth

in section 3.

74 A. Kiayias, S. Xu, and M. Yung

Theorem 2. The suite of m-server protocol suite KeyGDM , OPEN, DEC, MIX, CMP

is t-distribution-safe under the discrete-logarithm assumption and that H is a

random oracle that is controlled by the simulator provided that m ≥ 2t ’ 1.

Theorem 3. The DMGS scheme introduced above satis¬es (i) correctness, (ii)

traceability, (iii) anonymity, based on: the Strong-Di¬e Hellman Assumption

over G1 , G2 , the Decisional Di¬e-Hellman Assumption over G1 and the as-

sumption that H is modeled as a random oracle.

5 Conclusion

We conceptualized the notion of privacy preserving data mining within anony-

mous credential systems. We advocated this general notion as fundamental to

adapting anonymous credentials in general systems. We then instantiated it in

the context of group signatures as “data mining group signatures” and presented

two particular instantiations of it. We included a modeling of the notion, intro-

duced distribution safety as a way to modularly argue the security of distributed

cryptosystems and we presented an explicit construction of our notion. A num-

ber of issues remain for further investigation: Extending the mining instances

to other cases crucial in transaction systems while maintaining e¬ciency is an

important direction. Our de¬nition of the notion is based on combining the

properties of correctness, traceability and anonymity; considering attackers that

adaptively decide which of the properties to violate (as in a simulation-based

“ideal functionality” formulation) is an open subject to consider. Even further,

presenting constructions without the random oracle idealization or in a fully

concurrent execution model are open questions as well.

References

1. Aggarwal, G., Mishra, N., Pinkas, B.: Secure computation of the k th-ranked ele-

ment. In: Cachin and Camenisch [5], pp.40“55

2. Ateniese, G., Camenisch, J., Joye, M., Tsudik, G.: A practical and provably secure

coalition-resistant group signature scheme. In: Bellare, M. (ed.) CRYPTO 2000.

LNCS, vol. 1880. Springer, Heidelberg (2000)

3. Bellare, M., Micciancio, D., Warinschi, B.: Foundations of group signatures: Formal

de¬nitions, simpli¬ed requirements, and a construction based on general assump-

tions. In: Biham, E. (ed.) Advances in Cryptology “ EUROCRYPT 2003, Warsaw,

Poland. LNCS, vol. 2656. Springer, Heidelberg (2003)

4. Boneh, D., Boyen, X., Shacham, H.: Short group signatures. In: Franklin, M. (ed.)

CRYPTO 2004. LNCS, vol. 3152, pp. 41“55. Springer, Heidelberg (2004)

5. Cachin, C., Camenisch, J. (eds.): Advances in Cryptology - EUROCRYPT 2004,

International Conference on the Theory and Applications of Cryptographic Tech-

niques, nterlaken, Switzerland, May 2-6, 2004. LNCS, vol. 3027. Springer, Heidel-

berg (2004)

6. Camenisch, J., Hohenberger, S., Kohlweiss, M., Lysyanskaya, A., Meyerovich, M.:

How to win the clonewars: e¬cient periodic n-times anonymous authentication. In:

Juels, A., Wright, R.N., di Vimercati, S.D.C. (eds.) ACM Conference on Computer

and Communications Security, pp. 201“210. ACM, New York (2006)

Privacy Preserving Data Mining within Anonymous Credential Systems 75

7. Camenisch, J., Hohenberger, S., Lysyanskaya, A.: Compact e-cash. In: Cramer,

R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 302“321. Springer, Heidelberg

(2005)

8. Chaum, D.: Blind signatures for untraceable payments. In: Crypto (1982)