pendence assumptions.

Exercise 6.48. Show that the hypothesis of Theorem 6.18 can be weakened: all one needs to assume is that X1, . . . , Xℓ are mutually independent, and that Y1, . . . , Yℓ are mutually independent.

Exercise 6.49. Let Y1, . . . , Yℓ be mutually independent random variables, where each Yi is uniformly distributed on {0, . . . , m − 1}. For i = 1, . . . , ℓ, define Zi := Σ_{j=1}^{i} jYj. Let n be a prime greater than ℓ. Let S be any finite subset of Z^{×ℓ}. Let A be the event that for some (a1, . . . , aℓ) ∈ S, we have Zi ≡ ai (mod n) for i = 1, . . . , ℓ. Show that

P[A] ≤ |S|/n^ℓ + ℓn/m.

Exercise 6.50. Let X be a set of size n ≥ 1. Let F be a random function from X into X. Let G be a random permutation of X. Let x1, . . . , xℓ be distinct, fixed elements of X. Show that

∆[F(x1), . . . , F(xℓ); G(x1), . . . , G(xℓ)] ≤ ℓ(ℓ − 1)/(2n).
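For small parameters this bound can be checked exactly: the outputs of a random function are uniform over all n^ℓ tuples, while those of a random permutation are uniform over the n(n − 1) · · · (n − ℓ + 1) distinct tuples, so the statistical distance works out to 1 − n(n − 1) · · · (n − ℓ + 1)/n^ℓ. A small sketch of this check (ours, not part of the text):

```python
from fractions import Fraction
from math import perm  # perm(n, l) = n*(n-1)*...*(n-l+1)

def func_vs_perm_distance(n, l):
    """Exact statistical distance between (F(x1),...,F(xl)) and
    (G(x1),...,G(xl)): the random-function outputs are uniform on
    all n**l tuples, the random-permutation outputs are uniform on
    the perm(n, l) distinct tuples, giving 1 - perm(n, l)/n**l."""
    return 1 - Fraction(perm(n, l), n**l)

n, l = 10, 3
delta = func_vs_perm_distance(n, l)
bound = Fraction(l * (l - 1), 2 * n)
print(delta, "<=", bound)   # 7/25 <= 3/10
assert delta <= bound
```

Here the exact distance 7/25 = 0.28 sits just below the bound 3/10 = 0.3, so the estimate of the exercise is quite sharp.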

Exercise 6.51. Let H be a family of hash functions from A to Z such that (i) each h ∈ H maps A injectively into Z, and (ii) there exists ε, with 0 ≤ ε ≤ 1, such that ∆[H(a); H(a′)] ≤ ε for all a, a′ ∈ A, where H is uniformly distributed over H. Show that |H| ≥ (1 − ε)|A|.

6.9 Measures of randomness and the leftover hash lemma (∗)

In this section, we discuss different ways to measure “how random” a probability distribution is, and relations among them. Consider a distribution defined on a finite sample space V. In some sense, the “most random” distribution on V is the uniform distribution, while the least random would be a “point mass” distribution, that is, a distribution where one point v ∈ V in the sample space has probability 1, and all other points have probability 0.

We define three measures of randomness. Let X be a random variable taking values on a set V of size N.

1. We say X is δ-uniform on V if the statistical distance between X and the uniform distribution on V is equal to δ; that is,

   δ = (1/2) Σ_{v∈V} |P[X = v] − 1/N|.


2. The guessing probability γ(X) of X is defined to be

   γ(X) := max{P[X = v] : v ∈ V}.

3. The collision probability κ(X) of X is defined to be

   κ(X) := Σ_{v∈V} P[X = v]².

Observe that if X is uniformly distributed on V, then it is 0-uniform on V, and γ(X) = κ(X) = 1/N. Also, if X has a point mass distribution, then it is (1 − 1/N)-uniform on V, and γ(X) = κ(X) = 1. The quantity log₂(1/γ(X)) is sometimes called the min entropy of X, and the quantity log₂(1/κ(X)) is sometimes called the Rényi entropy of X. The collision probability κ(X) has the following interpretation: if X and X′ are identically distributed independent random variables, then κ(X) = P[X = X′] (see Exercise 6.26).
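For concreteness, all three measures are easy to compute directly from a probability vector. The following sketch (illustrative, not part of the text) evaluates δ, γ(X), and κ(X) for a non-uniform distribution on a set of size N = 4, using exact rational arithmetic:

```python
from fractions import Fraction

def randomness_measures(probs):
    """Return (delta, gamma, kappa) for a distribution given as a
    list of probabilities on a sample space of size N = len(probs)."""
    N = len(probs)
    # delta: statistical distance from the uniform distribution on V
    delta = Fraction(1, 2) * sum(abs(p - Fraction(1, N)) for p in probs)
    gamma = max(probs)                 # guessing probability
    kappa = sum(p * p for p in probs)  # collision probability
    return delta, gamma, kappa

probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
delta, gamma, kappa = randomness_measures(probs)
print(delta, gamma, kappa)   # 1/4 1/2 11/32
```

For the uniform distribution on 4 points the same function returns (0, 1/4, 1/4), and for a point mass it returns (3/4, 1, 1), matching the observations above.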

We first state some easy inequalities:

Theorem 6.19. Let X be a random variable taking values on a set V of size N, such that X is δ-uniform on V, γ := γ(X), and κ := κ(X). Then we have:

(i) κ ≥ 1/N;

(ii) γ² ≤ κ ≤ γ ≤ 1/N + δ.

Proof. Part (i) is immediate from Exercise 6.26. The other inequalities are left as easy exercises. □
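These inequalities are also easy to confirm numerically. The sketch below (ours, for illustration) checks parts (i) and (ii) exhaustively over all rational distributions with denominator 12 on a 3-element sample space:

```python
from fractions import Fraction
from itertools import product

def measures(probs):
    """Return (N, delta, gamma, kappa) for a distribution on a
    sample space of size N, given as a list of probabilities."""
    N = len(probs)
    delta = Fraction(1, 2) * sum(abs(p - Fraction(1, N)) for p in probs)
    gamma = max(probs)
    kappa = sum(p * p for p in probs)
    return N, delta, gamma, kappa

# Exhaustively check (i) and (ii) over all rational distributions
# (a/12, b/12, (12-a-b)/12) on a 3-element sample space.
for a, b in product(range(13), repeat=2):
    if a + b <= 12:
        N, delta, gamma, kappa = measures(
            [Fraction(a, 12), Fraction(b, 12), Fraction(12 - a - b, 12)])
        assert kappa >= Fraction(1, N)                               # (i)
        assert gamma**2 <= kappa <= gamma <= Fraction(1, N) + delta  # (ii)
print("Theorem 6.19 verified on all 91 test distributions")
```

Note that the last inequality in (ii) can be tight: for the distribution (1/2, 1/4, 1/4) one gets γ = 1/2 and 1/N + δ = 1/3 + 1/6 = 1/2.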

This theorem implies that the collision and guessing probabilities are minimal for the uniform distribution, which perhaps agrees with one's intuition. While the above theorem implies that γ and κ are close to 1/N when δ is small, the following theorem provides a converse of sorts:

Theorem 6.20. If X is δ-uniform on V, κ := κ(X), and N := |V|, then

κ ≥ (1 + 4δ²)/N.

Proof. We may assume that δ > 0, since otherwise the theorem is already true, simply from the fact that κ ≥ 1/N.

For v ∈ V, let pv := P[X = v]. We have δ = (1/2) Σ_{v∈V} |pv − 1/N|, and hence


1 = Σ_{v∈V} qv, where qv := |pv − 1/N|/(2δ). So we have

1/N ≤ Σ_{v∈V} qv²                      (by Exercise 6.25)

    = (1/(4δ²)) Σ_{v∈V} (pv − 1/N)²

    = (1/(4δ²)) (Σ_{v∈V} pv² − 1/N)    (again by Exercise 6.25)

    = (1/(4δ²)) (κ − 1/N),

from which the theorem follows immediately. □
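The bound of Theorem 6.20 is attained by some distributions. The sketch below (ours, for illustration) checks the inequality on distributions that put mass 1/N + ε on half the points of a 4-element set and 1/N − ε on the other half; for this family, κ equals (1 + 4δ²)/N exactly:

```python
from fractions import Fraction

def delta_and_kappa(probs):
    """Statistical distance from uniform and collision probability."""
    N = len(probs)
    delta = Fraction(1, 2) * sum(abs(p - Fraction(1, N)) for p in probs)
    kappa = sum(p * p for p in probs)
    return delta, kappa

# Mass 1/4 + eps on two points, 1/4 - eps on the other two (N = 4).
for k in range(5):
    eps = Fraction(k, 16)
    probs = [Fraction(1, 4) + eps] * 2 + [Fraction(1, 4) - eps] * 2
    delta, kappa = delta_and_kappa(probs)
    assert kappa >= (1 + 4 * delta**2) / 4   # Theorem 6.20
    assert kappa == (1 + 4 * delta**2) / 4   # and it is tight here
print("Theorem 6.20 is tight for this family of distributions")
```

A short calculation confirms the tightness: here δ = 2ε and κ = 1/4 + 4ε², while (1 + 4δ²)/4 = 1/4 + 4ε² as well.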

We are now in a position to state and prove a very useful result which, intuitively, allows us to convert a “low quality” source of randomness into a “high quality” source of randomness, making use of a universal family of hash functions (see §6.7.1).

Theorem 6.21 (Leftover hash lemma). Let H be a universal family of hash functions from A to Z, where Z is of size n. Let H denote a random variable with the uniform distribution on H, and let A denote a random variable taking values in A, with H, A independent. Let κ := κ(A). Then (H, H(A)) is δ-uniform on H × Z, where

δ ≤ √(nκ)/2.

Proof. Let Z denote a random variable uniformly distributed on Z, with H, A, Z mutually independent. Let m := |H| and δ := ∆[H, H(A); H, Z].

Let us compute the collision probability κ(H, H(A)). Let H′ have the same distribution as H, and A′ have the same distribution as A, with H, H′, A, A′ mutually independent. Then

κ(H, H(A)) = P[H = H′ ∧ H(A) = H′(A′)]

    = P[H = H′] P[H(A) = H(A′)]

    = (1/m) (P[H(A) = H(A′) | A = A′] P[A = A′] + P[H(A) = H(A′) | A ≠ A′] P[A ≠ A′])

    ≤ (1/m) (P[A = A′] + P[H(A) = H(A′) | A ≠ A′])


    ≤ (1/m) (κ + 1/n)

    = (1/(mn)) (nκ + 1).

Applying Theorem 6.20 to the random variable (H, H(A)), which takes values on the set H × Z of size N := mn, we see that 4δ² ≤ nκ, from which the theorem immediately follows. □
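The lemma can be checked exactly for a toy universal family. In the sketch below (illustrative; the parameters are ours) we take the standard universal family h_{a,b}(x) = (ax + b) mod p from Z_p to Z_p with p = 5, let A be uniform on a 4-element subset (so κ(A) = 1/4), compute the exact distribution of (H, H(A)), and verify that its distance from uniform on H × Z is at most √(nκ)/2:

```python
from fractions import Fraction

p = 5                                                  # n = |Z| = p
family = [(a, b) for a in range(p) for b in range(p)]  # m = p^2 functions
# h_{a,b}(x) = (a*x + b) mod p is a universal family from Z_p to Z_p:
# for x != x', P[h(x) = h(x')] = 1/p over a random choice of (a, b).

# A "low quality" source: A uniform on {0, 1, 2, 3}, so kappa(A) = 1/4.
source = {x: Fraction(1, 4) for x in range(4)}
kappa = sum(q * q for q in source.values())

# Exact joint distribution of (H, H(A)) on H x Z.
m, n = len(family), p
joint = {}
for (a, b) in family:
    for x, q in source.items():
        z = (a * x + b) % p
        joint[((a, b), z)] = joint.get(((a, b), z), Fraction(0)) + q / m

# Statistical distance from the uniform distribution on H x Z.
N = m * n
delta = Fraction(1, 2) * sum(
    abs(joint.get(((a, b), z), Fraction(0)) - Fraction(1, N))
    for (a, b) in family for z in range(n))

assert 4 * delta**2 <= n * kappa   # the leftover hash lemma bound
print(delta, float(n * kappa) ** 0.5 / 2)   # 8/25 vs. about 0.559
```

Note that even the five constant functions (those with a = 0) are allowed in a universal family; they contribute the bulk of the distance here, yet the bound √(nκ)/2 ≈ 0.559 still comfortably dominates δ = 8/25.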

Example 6.28. Suppose A is uniformly distributed over a subset A′ of A, where |A′| ≥ 2^160, so that κ(A) ≤ 2^−160. Suppose that H is a universal family of hash functions from A to Z, where |Z| ≤ 2^64. If H is uniformly