message.

Now suppose that an adversary is trying to trick Bob into accepting an

inauthentic message (i.e., one not originating from Alice). Assuming that

H is a pairwise independent family of hash functions, it is not too hard

to see that the adversary can succeed with probability no better than 1/n,

regardless of the strategy or computing power of the adversary. Indeed, on

the one hand, suppose the adversary gives Bob a pair (a , z ) at some time

6.7 Hash functions 129

before Alice sends her message. In this case, the adversary knows nothing

about the hash function, and so the correct value of the hash code of a

is completely unpredictable: it is equally likely to be any element of Z.

Therefore, no matter how clever the adversary is in choosing a and z , Bob

will accept (a , z ) as authentic with probability only 1/n. On the other hand,

suppose the adversary waits until Alice sends her message, intercepting the

message/hash code pair (a, z) sent by Alice, and gives Bob a pair (a , z ),

where a = a, instead of the pair (a, z). Again, since the adversary does not

know anything about the hash function other than the fact that the hash

code of a is equal to z, the correct hash code of a is completely unpredictable,

and again, Bob will accept (a , z ) as authentic with probability only 1/n.

One can easily make n large enough so that the probability that an ad-

versary succeeds is so small that for all practical purposes it is impossible

to trick Bob (e.g., n ≈ 2100 ).

More formally, and more generally, one can de¬ne an -forgeable mes-

sage authentication scheme to be a family H of hash functions from A

to Z with the following property: if H is uniformly distributed over H, then

(i) for all a ∈ A and z ∈ Z, we have P[H(a) = z] ¤ , and

(ii) for all a ∈ A and all functions f : Z ’ A and g : Z ’ Z, we have

P[A = a § H(A ) = Z ] ¤ ,

where Z := H(a), A := f (Z), and Z := g(Z).

Intuitively, part (i) of this de¬nition says that it is impossible to guess the

hash code of any message with probability better than ; further, part (ii)

of this de¬nition says that even after seeing the hash code of one message, it

is impossible to guess the hash code of a di¬erent message with probability

better than , regardless the choice of the ¬rst message (i.e., the value a) and

regardless of the strategy used to pick the second message and its putative

hash code, given the hash code of the ¬rst message (i.e., the functions f and

g).

Exercise 6.36. Suppose that a family H of hash functions from A to Z is

an -forgeable message authentication scheme. Show that ≥ 1/|Z|.

Exercise 6.37. Suppose that H is a family of hash functions from A to Z

and that |A| > 1. Show that if H satis¬es part (ii) of the de¬nition of an

-forgeable message authentication scheme, then it also satis¬es part (i) of

the de¬nition.

130 Finite and discrete probability distributions

Exercise 6.38. Let H be a family of hash functions from A to Z. For

≥ 0, we call H pairwise -predictable if the following holds: for H

uniformly distributed over H, for all a, a ∈ A, and for all z, z ∈ Z, we have

P[H(a) = z] ¤ and

P[H(a) = z] > 0 and a = a implies P[H(a ) = z | H(a) = z] ¤ .

(a) Show that if H is pairwise -predictable, then it is an -forgeable

message authentication scheme.

(b) Show that if H is pairwise independent, then it is pairwise 1/|Z|-

predictable. Combining this with part (a), we see that if H is pair-

wise independent, then it is a 1/|Z|-forgeable message authentication

scheme (which makes rigorous the intuitive argument given above).

(c) Give an example of a family of hash functions that is an -forgeable

message authentication scheme for some < 1, but is not pairwise

-predictable for any < 1.

Exercise 6.39. Give an example of an -forgeable message authentication

scheme, where is very small, but where if Alice authenticates two distinct

messages using the same hash function, an adversary can easily forge the

hash code of any message he likes (after seeing Alice™s two messages and their

hash codes). This shows that, as we have de¬ned a message authentication

scheme, Alice should only authenticate a single message per hash function

(t messages may be authenticated using t hash functions).

Exercise 6.40. Let H be an -universal family of hash functions from A to

Y (see Exercise 6.35), and let H be a pairwise independent family of hash

functions from Y to Z. De¬ne the composed family H —¦ H of hash functions

from A to Z as H —¦H := {φh ,h : h ∈ H , h ∈ H}, where φh ,h (a) := h (h(a))

for φh ,h ∈ H —¦H and for a ∈ A. Show that H —¦H is an ( +1/|Z|)-forgeable

message authentication scheme.

6.8 Statistical distance

This section discusses a useful measure “distance” between two random

variables. Although important in many applications, the results of this

section (and the next) will play only a very minor role in the remainder of

the text.

Let X and Y be random variables which both take values on a ¬nite set

6.8 Statistical distance 131

V. We de¬ne the statistical distance between X and Y as

1

|P[X = v] ’ P[Y = v]|.

∆[X; Y ] :=

2

v∈V

Theorem 6.14. For random variables X, Y, Z, we have

(i) 0 ¤ ∆[X; Y ] ¤ 1,

(ii) ∆[X; X] = 0,

(iii) ∆[X; Y ] = ∆[Y ; X], and

(iv) ∆[X; Z] ¤ ∆[X; Y ] + ∆[Y ; Z].

Proof. Exercise. 2

Note that ∆[X; Y ] depends only on the individual distributions of X and

Y , and not on the joint distribution of X and Y . As such, one may speak of

the statistical distance between two distributions, rather than between two

random variables.

Example 6.26. Suppose X has the uniform distribution on {1, . . . , n}, and

Y has the uniform distribution on {1, . . . , n’k}, where 0 ¤ k ¤ n’1. Let us

compute ∆[X; Y ]. We could apply the de¬nition directly; however, consider

the following graph of the distributions of X and Y :

1/(n ’ k)

A

1/n

B C

n’k

0 n

The statistical distance between X and Y is just 1/2 times the area of

regions A and C in the diagram. Moreover, because probability distributions

sum to 1, we must have

area of B + area of A = 1 = area of B + area of C,

and hence, the areas of region A and region C are the same. Therefore,

∆[X; Y ] = area of A = area of C = k/n. 2

The following characterization of statistical distance is quite useful:

Theorem 6.15. Let X and Y be random variables taking values on a set

132 Finite and discrete probability distributions

V. For any W ⊆ V, we have

∆[X; Y ] ≥ |P[X ∈ W] ’ P[Y ∈ W]|,

and equality holds if W is either the set of all v ∈ V such that P[X = v] <

P[Y = v], or the complement of this set.

Proof. Suppose we partition the set V into two sets: the set V0 consisting

of those v ∈ V such that P[X = v] < P[Y = v], and the set V1 consisting of

those v ∈ V such that P[X = v] ≥ P[Y = v]. Consider the following rough

graph of the distributions of X and Y , where the elements of V0 are placed

to the left of the elements of V1 :

Y A

C

X B