increasing l. In fact, the distribution converges to a normalized χ² distribution as l → ∞. The F distribution is partially tabulated in Appendix G.

2.7.11 The Exponential Distribution. The distribution of wind energy, which is proportional to the square of wind speed, provides an interesting application of the χ² distribution. To a first order of approximation, the zonal and meridional components of the wind are normally distributed and independent (but see [2.6.6] and also Cook [89] and Holzer [180]). Thus the wind energy, when properly scaled, is approximately distributed χ²(2). The latter distribution, illustrated in Figure 2.5, is also an example of an exponential distribution. The likelihood of observing a particular wind energy falls off exponentially with magnitude.

The density function of an exponential random variable X is given by

    f_X(x) = θ^{−1} e^{−x/θ}   if x > 0
             0                 otherwise,

and the corresponding cumulative distribution function is given by

    F_X(x) = 0                 if x ≤ 0
             1 − e^{−x/θ}      if x > 0.

The mean and variance are

    µ = θ   and   σ² = θ².

The L-moments are

    λ(1) = θ
    λ(2) = θ/2
    γ1^L = 1/3
    γ2^L = 1/6.

variable because events in the Poisson process occur randomly. Let F_T(·) be the cumulative distribution function of T. That is, F_T(t) = P(T < t) = 1 − P(T ≥ t). The event T ≥ t occurs when no events take place in the time interval (0, t). Equation (2.10) can be used to show that

    P(no events in (0, t)) = e^{−λt}

and therefore that

    F_T(t) = 1 − e^{−λt}   if t ≥ 0
             0             otherwise.

Hence, the waiting time is exponentially distributed with θ = λ^{−1}. Consequently, the mean waiting time is inversely proportional to the intensity of the Poisson process.

12 We can assume that we start observing the process just after the occurrence of an event, so the waiting time for the first event is equivalent to the waiting time between events.

2.8 Random Vectors

2.8.1 Continuous Random Vectors. A continuous random vector X is a vector of continuous random variables.

The climate system has a myriad of examples of continuous random vectors. One example is the monthly mean 300 hPa height field Z, either as simulated by a climate model, or as analysed from observations (Figure 1.1). In both cases, the random vector contains several hundred or thousand entries, each representing an observation at a different location. Another example is the surface temperature field T, which contains screen temperature13 observations over the land and ocean surfaces.

13 'Screen temperature' is taken 2 m above the surface. The word 'screen' alludes to the enclosures (Stevenson screens) that are used to house land-based thermometers.

If we want to study relationships


between geopotential and surface temperature, then we might form an even larger random vector by combining Z and T.

2.8.2 Joint Probability Density Function. The joint probability density function of an m-dimensional random vector X is a non-negative, continuous function defined on R^m for which ∫_{R^m} f_X(x) dx = 1.

The cumulative distribution function also extends to the multivariate case in a natural way. However, the concept is not as useful as in the univariate case, and therefore will not be discussed.

2.8.3 Marginal Distributions. In our discussion of discrete multivariate distributions [2.5.2], the marginal distribution of one variable was found by summing the joint probability function over all combinations of values taken by the remaining variables. Since integration is the continuous variable analogue to summation, the marginal probability density function for the kth variable in X, say Xk, is defined by

    f_{Xk}(x) = ∫_{R^{m−1}} f_X(x1, . . . , xk−1, x, xk+1, . . . , xm) d x̃k,

where x̃k = (x1, . . . , xk−1, xk+1, . . . , xm).

2.8.4 Expectation of a Weighted Sum of the Components of a Random Vector. The expected value of the kth component of X is the mean of the marginal distribution

    E(Xk) = ∫_{−∞}^{∞} xk f_{Xk}(xk) dxk = ∫_{R^m} xk f_X(x) dx,

where x = (x1, . . . , xm)^T.

The expected value of a linear combination of two components of X is

    E(aXk + bXj + c) = ∫_{R^m} (a xk + b xj + c) f_X(x) dx
                     = a E(Xk) + b E(Xj) + c.                    (2.30)

The same result can be obtained directly from (2.15) and (2.16) as follows:

    E(a g1(X) + b g2(X) + c) = a E(g1(X)) + b E(g2(X)) + c,

where the functions g1 and g2 select the kth and jth components of X respectively.

Note that (2.30) holds regardless of the correlation between the components Xk and Xj.

2.8.5 Independent Random Variables. The definition of independent random variables also extends smoothly from the discrete to the continuous case.

Let X be a random vector and let Xi and Xj be any pair of elements in the vector. The components of X are said to be pairwise independent if for every (i, j) the joint density function of Xi and Xj can be written as the product of the marginal density functions of Xi and Xj.

The components of X are said to be jointly independent if

    f_X(x) = ∏_{i=1}^{m} f_{Xi}(xi).

2.8.6 Conditional Density Functions. Finally, the concept of the conditional distribution is extended to the continuous case. However, here it is likelihoods, rather than probabilities, that are scaled. We saw that in the discrete case [2.5.4], the act of conditioning on the outcome of a variable reduced the number of outcomes that were possible by some finite proportion. Similarly, conditioning in the continuous case restricts possible realizations of the random vector to a hyper-space of the original m-dimensional vector space. The conditional probability density function is defined as follows.

Let X be a random vector of the form (X1, X2), where X1 and X2 are also both random vectors. The conditional probability density function of X1, given X2 = x2, is

    f_{X1|X2=x2}(x1) = f_{X1,X2}(x1, x2) / f_{X2}(x2)            (2.31)

for all x2 such that f_{X2}(x2) is nonzero.

2.8.7 The Multivariate Mean, the Covariance Matrix, and the Correlation Matrix. The long-term mean value of repeated realizations of an m-dimensional random vector X is given by

    µ_X = E(X) = ∫_{R^m} x f_X(x) dx.

Note that the elements of µ_X are the means of the corresponding marginal distributions. We will usually refer only to µ rather than µ_X unless clarity requires that specific reference be made to the random vector.
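As a concrete check of the marginal and conditional definitions above, the sketch below discretizes a joint density on a grid: the marginal f_{X2}(x2) is obtained by a Riemann sum standing in for the integral over x1 (cf. [2.8.3]), and the conditional density built from (2.31) then integrates to 1. The joint density chosen here (a standard bivariate normal with correlation ρ = 0.5), the grid, and the conditioning value x2 = 1 are all illustrative assumptions, not taken from the text.

```python
import math

# Illustrative joint density: standard bivariate normal, rho = 0.5
# (an assumption for this sketch; any valid joint density would do).
RHO = 0.5

def joint(x1, x2):
    norm = 1.0 / (2.0 * math.pi * math.sqrt(1.0 - RHO ** 2))
    q = (x1 ** 2 - 2.0 * RHO * x1 * x2 + x2 ** 2) / (1.0 - RHO ** 2)
    return norm * math.exp(-0.5 * q)

# Riemann-sum stand-in for integration over x1 on [-6, 6].
dx = 0.01
grid = [i * dx for i in range(-600, 601)]

x2 = 1.0
# Marginal of X2 at x2: integrate the joint density over x1.
f_x2 = sum(joint(x1, x2) for x1 in grid) * dx

# Conditional density (2.31): f(x1 | x2) = f(x1, x2) / f(x2).
cond = [joint(x1, x2) / f_x2 for x1 in grid]

# It integrates to 1 (here exactly, since the same grid defines f_x2),
# and its mean for this density is RHO * x2.
total = sum(cond) * dx
cond_mean = sum(x * c for x, c in zip(grid, cond)) * dx
print(round(total, 6), round(cond_mean, 3))  # 1.0 0.5
```

The conditional mean RHO * x2 = 0.5 differs from the unconditional mean E(X1) = 0: conditioning on X2 = x2 rescales the likelihoods along the slice x2 = 1, exactly as described in [2.8.6].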

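The remark that (2.30) holds regardless of correlation can be illustrated by simulation. In the sketch below, the constants a, b, c and the construction Xj = 0.8 Xk + noise are arbitrary illustrative choices: the pair is strongly correlated, yet the estimate of E(aXk + bXj + c) still agrees with a E(Xk) + b E(Xj) + c.

```python
import random

random.seed(42)

N = 200_000
a, b, c = 2.0, -3.0, 1.0  # arbitrary illustrative constants

# A deliberately correlated pair: Xj = 0.8*Xk + noise, so corr(Xk, Xj)
# is far from zero, while E(Xk) = E(Xj) = 0.
xk = [random.gauss(0.0, 1.0) for _ in range(N)]
xj = [0.8 * u + random.gauss(0.0, 0.6) for u in xk]

# Monte Carlo estimates of both sides of (2.30).
lhs = sum(a * u + b * v + c for u, v in zip(xk, xj)) / N    # E(aXk + bXj + c)
rhs = a * (sum(xk) / N) + b * (sum(xj) / N) + c             # a E(Xk) + b E(Xj) + c
print(round(lhs, 1), round(rhs, 1))  # both close to c = 1.0
```

On any finite sample the two estimates coincide up to rounding, because the sample mean is itself linear; the interesting point is that both land near a·0 + b·0 + c = 1.0 even though Xk and Xj co-vary.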

Jointly distributed random variables often have a tendency to vary jointly.14 This 'co-variability' may be quantitatively described by the multivariate analogue of variance, namely the covariance matrix:

    Σ_{X,X} = E((X − µ)(X − µ)^T)                                (2.32)
            = ∫_{R^m} (x − µ)(x − µ)^T f_X(x) dx.

As above, we will drop the reference to the random vector in the notation for the covariance matrix unless a need for clarity dictates otherwise. The (i, j)th element of Σ contains the

A possible difficulty with covariance as a measure of the joint variability of a pair of random variables is that covariance is not scale invariant. As with the transports that covariances often represent in climate problems (see Section 8.2), a change in units has a profound effect on the size of the covariance. If all realizations of Xi and Xj are multiplied by constants ci and cj respectively, the covariance will increase by a factor of ci cj. However, the variances of Xi and Xj also increase by factors of ci² and cj².

Correlation (or cross-correlation) is a measure of covariability that is scale invariant. The correlation between two random variables Xi