2 21’s s(1 ’ s)

This elementary geometrical fact - that 1/(t(1 − t)) is the unique density

(up to scalar multiple) which is invariant under all the φ_a - was given a deep

philosophical interpretation by Jaynes, [?]:

Suppose we have a possible event which may or may not occur, and we

have a population of individuals each of whom has a clear opinion (based on

ingrained prejudice, say, from reading the newspapers or watching television)

of the probability of the event being true. So Mr. A assigns probability p(A)

to the event E being true and (1-p(A)) as the probability of its not being true,

while Mr. B assigns probability p(B) to its being true and (1-p(B)) to its not

being true and so on.

Suppose an additional piece of information comes in, which would have a

(conditional) probability x of being generated if E were true and y of this infor-

mation being generated if E were not true. We assume that both x and y are

positive, and that every individual thinks rationally in the sense that on the ad-

vent of this new information he changes his probability estimate in accordance

with Bayes' law, which says that the posterior probability p′ is given in terms

of the prior probability p by

$$p' = \frac{px}{px + (1 - p)y} = \phi_a(p) \quad\text{where}\quad a := \frac{x}{y}.$$

We might say that the population as a whole has been invariantly prejudiced if

any such additional evidence does not change the proportion of people within

the population whose belief lies in a small interval. Then the density describing

this state of knowledge (or rather of ignorance) must be the density

$$\rho(p) = \frac{1}{p(1 - p)}.$$
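As a numerical sanity check of this invariance, here is a minimal Python sketch. It assumes that φ_a(p) = ap/(ap + 1 − p), which is what the displayed Bayes formula gives after dividing numerator and denominator by y; the function names are illustrative and not from the text. A density ρ is invariant under φ_a when ρ(φ_a(p)) φ_a′(p) = ρ(p), so the sketch evaluates the difference of the two sides.

```python
def phi(a, p):
    """Bayes update of a prior probability p with likelihood ratio a = x / y."""
    return a * p / (a * p + (1.0 - p))


def phi_prime(a, p):
    """Derivative of phi(a, .) with respect to p."""
    return a / (a * p + (1.0 - p)) ** 2


def rho(p):
    """Candidate invariant density 1 / (p (1 - p)) on the open interval (0, 1)."""
    return 1.0 / (p * (1.0 - p))


def invariance_defect(a, p):
    """rho(phi(p)) * phi'(p) - rho(p); zero when rho is invariant under phi(a, .)."""
    return rho(phi(a, p)) * phi_prime(a, p) - rho(p)


# Print the defect for a few likelihood ratios and prior probabilities.
for a in (0.25, 1.0, 3.0):
    for p in (0.1, 0.5, 0.9):
        print(a, p, invariance_defect(a, p))
```

Up to floating-point rounding the printed defects should vanish. This is only a spot check at a few points, not a proof of uniqueness.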

According to this reasoning of Jaynes, we take the above density to describe the

prior probability an individual (thought of as a population of subprocessors in his

brain) would assign to the probability of an outcome of a given experiment. If a

series of experiments then yielded M successes and N failures, Bayes' theorem (in

its continuous version) would then yield the posterior distribution of probability

assignments as being proportional to

$$p^{M-1}(1 - p)^{N-1},$$

the Beta distribution with parameters M, N.
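The step from prior to posterior can be sketched in Python as follows. The sketch multiplies the prior 1/(p(1 − p)) by the likelihood p^M (1 − p)^N of M successes and N failures and compares the result with the Beta kernel; the helper names and the midpoint-rule integrator are assumptions made for illustration, not part of the text.

```python
import math


def unnormalized_posterior(p, successes, failures):
    """Prior 1/(p(1-p)) times the likelihood p^M (1-p)^N."""
    prior = 1.0 / (p * (1.0 - p))
    likelihood = p ** successes * (1.0 - p) ** failures
    return prior * likelihood


def beta_kernel(p, m, n):
    """Unnormalized Beta density p^(m-1) (1-p)^(n-1)."""
    return p ** (m - 1) * (1.0 - p) ** (n - 1)


def beta_function(m, n):
    """Normalizing constant B(m, n) = Gamma(m) Gamma(n) / Gamma(m + n)."""
    return math.gamma(m) * math.gamma(n) / math.gamma(m + n)


def midpoint_integral(f, steps=10_000):
    """Midpoint-rule approximation of the integral of f over (0, 1)."""
    h = 1.0 / steps
    return h * sum(f((k + 0.5) * h) for k in range(steps))


M, N = 3, 2
total = midpoint_integral(lambda p: unnormalized_posterior(p, M, N))
print(total, beta_function(M, N))
```

Dividing the unnormalized posterior by B(M, N) gives a density that integrates to one. For M = 0 or N = 0 the posterior is not normalizable, so the sketch assumes M and N are both at least 1.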

Chapter 5

The contraction fixed point theorem

5.1 Metric spaces

Until now we have used the notion of metric quite informally. It is time for a

formal definition. For any set X, we let X × X (called the Cartesian product

of X with itself) denote the set of all ordered pairs of elements of X. (More

generally, if X and Y are sets, we let X × Y denote the set of all pairs (x, y)

with x ∈ X and y ∈ Y, and call it the Cartesian product of X with Y.)

A metric for a set X is a function d from X × X to the real numbers R,

d : X × X → R

such that for all x, y, z ∈ X

1. d(x, y) = d(y, x)

2. d(x, z) ≤ d(x, y) + d(y, z)

3. d(x, x) = 0

4. If d(x, y) = 0 then x = y.

The inequality in 2) is known as the triangle inequality since if X is the

plane and d the usual notion of distance, it says that the length of an edge of a

triangle is at most the sum of the lengths of the two other edges. (In the plane,

the inequality is strict unless the three points lie on a line.)
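The four conditions can be tested mechanically on a finite sample of points, as in the following Python sketch. The function `metric_violations` and the two candidate distance functions are illustrative names, not from the text. Passing the check on a sample does not prove that a function is a metric, but a single failure disproves it.

```python
import itertools
import math


def metric_violations(points, d, tol=1e-12):
    """Return the conditions 1)-4) that fail somewhere on a finite sample."""
    failed = set()
    for x in points:
        if abs(d(x, x)) > tol:
            failed.add("3: d(x, x) = 0")
    for x, y in itertools.product(points, repeat=2):
        if abs(d(x, y) - d(y, x)) > tol:
            failed.add("1: symmetry")
        if x != y and abs(d(x, y)) <= tol:
            failed.add("4: d(x, y) = 0 implies x = y")
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z) + tol:
            failed.add("2: triangle inequality")
    return failed


def euclidean(u, v):
    """Usual distance in the plane."""
    return math.hypot(u[0] - v[0], u[1] - v[1])


def squared_euclidean(u, v):
    """Squared distance: symmetric and positive, but not a metric."""
    return (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2


sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.5, 1.5)]
print(metric_violations(sample, euclidean))
print(metric_violations(sample, squared_euclidean))
```

The squared distance fails the triangle inequality on the three collinear points in the sample, which is why it is not a metric even though it satisfies the other three conditions.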

Condition 4) is in many ways inessential, and it is often convenient to drop

it, especially for the purposes of some proofs. For example, we might want to

consider the decimal expansions .49999 . . . and .50000 . . . as different, but as

having zero distance from one another. Or we might want to “identify” these

two decimal expansions as representing the same point.

A function d which satisfies only conditions 1) - 3) is called a pseudo-metric.
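A small Python sketch of a pseudo-metric that is not a metric: on points of the plane with integer coordinates, compare first coordinates only. This example and its function names are hypothetical illustrations, not taken from the text. Conditions 1) - 3) hold on the sample, while condition 4) fails because two distinct points can share a first coordinate.

```python
import itertools


def first_coordinate_distance(u, v):
    """Hypothetical example: distance between first coordinates only."""
    return abs(u[0] - v[0])


def is_pseudo_metric_on(points, d):
    """Check conditions 1)-3) on every pair and triple of the sample."""
    symmetric = all(d(x, y) == d(y, x)
                    for x, y in itertools.product(points, repeat=2))
    triangle = all(d(x, z) <= d(x, y) + d(y, z)
                   for x, y, z in itertools.product(points, repeat=3))
    zero_on_diagonal = all(d(x, x) == 0 for x in points)
    return symmetric and triangle and zero_on_diagonal


def distinct_points_at_distance_zero(points, d):
    """Pairs of distinct points witnessing the failure of condition 4)."""
    return [(x, y) for x, y in itertools.combinations(points, 2)
            if x != y and d(x, y) == 0]


sample = [(0, 0), (0, 1), (1, 0), (3, 2)]
print(is_pseudo_metric_on(sample, first_coordinate_distance))
print(distinct_points_at_distance_zero(sample, first_coordinate_distance))
```

Integer coordinates are used so that the comparisons are exact.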

A metric space is a pair (X, d) where X is a set and d is a metric on X.

Almost always, when d is understood, we engage in the abuse of language and

speak of “the metric space X”.

Similarly for the notion of a pseudo-metric space.

In like fashion, we call d(x, y) the distance between x and y, the function

d being understood.

If r is a positive number and x ∈ X, the (open) ball of radius r about x is

defined to be the set of points at distance less than r from x and is denoted by

B_r(x). In symbols,

B_r(x) := {y | d(x, y) < r}.

If r and s are positive real numbers and if x and z are points of a pseudo-

metric space X, it is possible that B_r(x) ∩ B_s(z) = ∅. This will certainly be

the case if d(x, z) > r + s by virtue of the triangle inequality. Suppose that this

intersection is not empty and that

w ∈ B_r(x) ∩ B_s(z).

If y ∈ X is such that d(y, w) < min[r − d(x, w), s − d(z, w)] then the triangle

inequality implies that y ∈ B_r(x) ∩ B_s(z). Put another way, if we set t :=

min[r − d(x, w), s − d(z, w)] then

B_t(w) ⊂ B_r(x) ∩ B_s(z).

Put still another way, this says that the intersection of two (open) balls is either

empty or is a union of open balls. So if we call a set in X open if either it

is empty, or is a union of open balls, we conclude that the intersection of any

finite number of open sets is open, as is the union of any number of open sets.

In technical language, we say that the open balls form a base for a topology on

X.
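The ball inclusion used in this argument can be spot-checked numerically. The Python sketch below works in the plane with the usual distance, takes a point w in two overlapping balls, forms the radius t as the minimum of r − d(x, w) and s − d(z, w), and samples points of the ball of radius t about w. The specific centers, radii and helper names are assumptions chosen for illustration.

```python
import math
import random


def dist(u, v):
    """Usual distance in the plane."""
    return math.hypot(u[0] - v[0], u[1] - v[1])


def in_ball(center, radius, point):
    """True when point lies in the open ball of the given radius about center."""
    return dist(center, point) < radius


def inner_radius(x, r, z, s, w):
    """t := min[r - d(x, w), s - d(z, w)] for a point w lying in both balls."""
    return min(r - dist(x, w), s - dist(z, w))


def check_inner_ball(x, r, z, s, w, samples=10_000, seed=0):
    """Sample points of B_t(w) and confirm each lies in both B_r(x) and B_s(z)."""
    rng = random.Random(seed)
    t = inner_radius(x, r, z, s, w)
    for _ in range(samples):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        # Stay strictly inside B_t(w), with a small margin for rounding.
        radius = 0.999 * t * rng.random()
        y = (w[0] + radius * math.cos(angle), w[1] + radius * math.sin(angle))
        if not (in_ball(x, r, y) and in_ball(z, s, y)):
            return False
    return True


x, r = (0.0, 0.0), 1.0
z, s = (1.2, 0.0), 1.0
w = (0.6, 0.1)
print(inner_radius(x, r, z, s, w), check_inner_ball(x, r, z, s, w))
```

Random sampling cannot replace the triangle-inequality argument; it only illustrates that the chosen radius is small enough.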

A map f : X → Y from one pseudo-metric space to another is called continuous

if the inverse image under f of any open set in Y is an open set in

X. Since an open set is a union of balls, this amounts to the condition that

the inverse image of an open ball in Y is a union of open balls in X, or, to use

the familiar ε, δ language, that if f(x) = y then for every ε > 0 there exists a

δ = δ(x, ε) > 0 such that

f(B_δ(x)) ⊂ B_ε(y).

Notice that in this definition δ is allowed to depend both on x and on ε. The

map is called uniformly continuous if we can choose the δ independently of x.

An even stronger condition on a map from one pseudo-metric space to another

is the Lipschitz condition. A map f : X → Y from a pseudo-metric

space (X, d_X) to a pseudo-metric space (Y, d_Y) is called a Lipschitz map with

Lipschitz constant C if

d_Y(f(x_1), f(x_2)) ≤ C d_X(x_1, x_2) ∀ x_1, x_2 ∈ X.

Clearly a Lipschitz map is uniformly continuous.

For example, suppose that A is a fixed subset of a pseudo-metric space X.

Define the function d(A, ·) from X to R by

d(A, x) := inf{d(x, w), w ∈ A}.

The triangle inequality says that

d(x, w) ≤ d(x, y) + d(y, w)

for all w, in particular for w ∈ A, and hence taking lower bounds we conclude

that

d(A, x) ≤ d(x, y) + d(A, y),

or

d(A, x) − d(A, y) ≤ d(x, y).

Reversing the roles of x and y then gives

|d(A, x) − d(A, y)| ≤ d(x, y).

Using the standard metric on the real numbers where the distance between a

and b is |a − b|, this last inequality says that d(A, ·) is a Lipschitz map from X

to R with C = 1.
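The Lipschitz bound for the distance to a set can be illustrated with a short Python sketch. It assumes a finite set A of real numbers (so the infimum is a minimum) and the standard metric on R; the function names and the sample data are illustrative only.

```python
import random


def absolute_difference(a, b):
    """Standard metric on the real numbers."""
    return abs(a - b)


def distance_to_set(A, x, d):
    """d(A, x) := inf{d(x, w), w in A}; for a finite A the infimum is a minimum."""
    return min(d(x, w) for w in A)


def lipschitz_one_on_sample(A, points, d, tol=1e-12):
    """Check |d(A, x) - d(A, y)| <= d(x, y) for all pairs of sample points."""
    return all(
        abs(distance_to_set(A, x, d) - distance_to_set(A, y, d)) <= d(x, y) + tol
        for x in points
        for y in points
    )


A = [0.0, 2.0, 5.0]
rng = random.Random(0)
points = [rng.uniform(-3.0, 8.0) for _ in range(200)]
print(lipschitz_one_on_sample(A, points, absolute_difference))
```

The small tolerance only absorbs floating-point rounding; the inequality itself holds exactly by the argument above.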

A closed set is defined to be a set whose complement is open. Since the

inverse image of the complement of a set (under a map f ) is the complement