normal distribution: the χ² distribution, the t distribution, and the F distribution. We will see these distributions often in settings where we need to know about the uncertainty of an estimated mean or variance, or compare estimates of means or variances.

2.7.8 The χ² Distribution. The χ² distribution is defined as that of the sum of k independent squared N(0, 1) random variables. It is therefore defined only on the positive half of the real line. The form of this distribution function depends upon a single parameter, k, referred to as the degrees of freedom (df).¹¹

The χ² distribution has an additive property: if X1 and X2 are independent χ² random variables with k1 and k2 df respectively, then X1 + X2 is a χ²(k1 + k2) random variable. It then follows that a χ²(k) random variable can be thought of as a sum of k independent χ²(1) random variables.

Several characteristics of the χ² distribution can be noticed. First, all of the distributions are skewed to the right, but distributions with small numbers of degrees of freedom are more skewed than those with large numbers of degrees of freedom. In fact, the χ²(30) distribution is very nearly normal, in accordance with the additive property and the Central Limit Theorem [2.7.5]. Second, only the distributions with one and two degrees of freedom have their modes (i.e., their most likely values) at the origin. Third, the spread of the distributions depends strongly upon the number of degrees of freedom.

¹¹ The expression degrees of freedom is used frequently in this book. Here it has two equivalent technical interpretations. Specifically, if X1, ..., Xn are independent, identically distributed N(µ, σ²) random variables, then χ² = (1/σ²) Σᵢ₌₁ⁿ (Xᵢ − X̄)² is distributed χ²(n − 1). This sum of squared deviations can be re-expressed as a sum of n − 1 squared N(0, 1) random variables. This gives the first interpretation of degrees of freedom, which is frequently encountered in climate research: χ² contains information from n − 1 independent, identically distributed random variables. The other interpretation is geometrical. The deviations xᵢ − x̄ can be arranged in an n-dimensional random vector (x₁ − x̄, ..., xₙ − x̄)ᵀ. This vector takes values in an (n − 1)-dimensional subspace since the deviations are constrained to sum to zero. See also [6.6.1] and Section 6.8.
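The defining construction and the additive property lend themselves to a quick Monte Carlo check. The following sketch (an illustrative aside, not part of the original text; it assumes Python with NumPy is available) builds a χ²(5) sample as a sum of five squared N(0, 1) variables and adds independent χ²(3) and χ²(4) samples:

```python
import numpy as np

rng = np.random.default_rng(0)
reps = 200_000  # number of Monte Carlo replicates

# A chi-squared(k) variable is the sum of k squared N(0, 1) variables.
k = 5
x = np.sum(rng.standard_normal((reps, k)) ** 2, axis=1)

# Additive property: chi2(3) + chi2(4) should behave like chi2(7).
x1 = rng.chisquare(3, reps)
x2 = rng.chisquare(4, reps)
s = x1 + x2

# Sample moments should be near E(X) = df and Var(X) = 2 * df.
print(x.mean(), x.var())   # near 5 and 10
print(s.mean(), s.var())   # near 7 and 14
```

Comparing the sample moments of `s` with those of a χ²(7) distribution makes the additivity claim concrete without any algebra.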

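The first interpretation of degrees of freedom in footnote 11 can also be checked numerically: for samples of size n from N(µ, σ²), the scaled sum of squared deviations should behave like a χ²(n − 1) variable. A sketch under the same NumPy assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
n, mu, sigma = 10, 2.0, 3.0
reps = 200_000

# chi2 = (1/sigma^2) * sum_i (X_i - Xbar)^2, for many independent samples
x = rng.normal(mu, sigma, (reps, n))
dev = x - x.mean(axis=1, keepdims=True)
chi2 = (dev ** 2).sum(axis=1) / sigma ** 2

# A chi-squared(n - 1) variable has mean n - 1 and variance 2(n - 1).
print(chi2.mean(), chi2.var())  # near 9 and 18
```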

Figure 2.6: Probability density functions for t(k) random variables with 1, 2, 10, and 30 degrees of freedom.

In general, if X ∼ χ²(k), then

    E(X) = k
    Var(X) = 2k.

2.7.9 The t Distribution. A random variable T has the t distribution with k degrees of freedom, that is, T ∼ t(k), if its probability density function is given by

    f_T(t; k) = Γ((k + 1)/2) (1 + t²/k)^(−(k+1)/2) / (√(kπ) Γ(k/2)).

T random variables are strongly related to normal and χ² random variables. In particular, if A and B are independent random variables such that

    A ∼ N(0, 1) and B ∼ χ²(k),

then

    A/√(B/k) ∼ t(k).

The t distribution was introduced by W.S. Gosset under the pseudonym 'Student', so it is often called the Student's t distribution.

The t distribution is symmetric about zero. When T has more than one degree of freedom, the first central moment is zero (see, e.g., Kalbfleisch [208]),

    E(T) = 0 for k ≥ 2.

The first moment does not exist when k = 1. Similarly, the second central moment exists for k ≥ 3, where

    Var(T) = k/(k − 2) for k ≥ 3.

It may be shown [208] that the jth moments of T for j ≥ k do not exist.

The t(k) distribution is shown in Figure 2.6 for four values of the degrees of freedom parameter k. The density function f_T(t; 1) for T with k = 1 degree of freedom does tend to zero as t → ±∞, but too slowly for the integral ∫ t f_T(t; 1) dt to exist. The convergence is faster when k = 2, so that the first moment exists but not the second moment. The convergence increases with the increasing numbers of degrees of freedom. Ultimately, the t distribution converges to the standard normal distribution. The difference between the distributions is small even when k = 10, and it becomes negligible for k ≥ 30.

The t(k) distribution is partially tabulated in Appendix F.

2.7.10 The F Distribution. Another of the sampling distributions closely related to the normal distribution is the F distribution. A random variable F is said to have an F distribution with k and l degrees of freedom, that is, F ∼ F(k, l), if the density function of F, f_F(f; k, l), is given by

    f_F(f; k, l) = [(k/l)^(k/2) Γ((k + l)/2) / (Γ(k/2) Γ(l/2))] × f^((k−2)/2) (1 + (k/l) f)^(−(k+l)/2).

This distribution arises in estimation and testing problems when statistics are developed that can be expressed as a constant times a ratio of independent χ² random variables (hence the connection to the normal distribution; see [2.7.8]). In particular, if X and Y are independent random variables such that X ∼ χ²(k) and Y ∼ χ²(l), then

    (X/k)/(Y/l) ∼ F(k, l).    (2.29)

The first two central moments are

    µ = E(F) = l/(l − 2)

for l > 2 and

    Var(F) = 2l²(k + l − 2) / (k(l − 2)²(l − 4))

for l > 4. As for the t distribution, not all moments of the F distribution exist (see Kalbfleisch [208]). The F(k, l) density function is shown in Figure 2.7 for three combinations of (k, l). The distribution is skewed for all values of l. For fixed k, the skewness decreases slightly with
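The construction T = A/√(B/k) for the t distribution is easy to verify by simulation. The sketch below (illustrative only, not part of the original text; it assumes NumPy) builds t(10) variables from independent normal and χ² draws, checks the moments E(T) = 0 and Var(T) = k/(k − 2), and shows the heavier-than-normal tails:

```python
import numpy as np

rng = np.random.default_rng(2)
reps, k = 200_000, 10

# T = A / sqrt(B/k) with A ~ N(0, 1) and B ~ chi2(k), independent.
a = rng.standard_normal(reps)
b = rng.chisquare(k, reps)
t = a / np.sqrt(b / k)

# For k = 10 both moments exist: E(T) = 0, Var(T) = 10/8 = 1.25.
print(t.mean(), t.var())

# t(10) is close to N(0, 1) but puts slightly more mass in the tails.
z = rng.standard_normal(reps)
print((np.abs(t) > 2).mean(), (np.abs(z) > 2).mean())
```

The tail comparison illustrates why the difference from the standard normal is already small at k = 10 yet still visible in exceedance frequencies.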

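Equation (2.29) and the F moment formulas can be checked the same way. The sketch below (illustrative, not part of the original text; it assumes NumPy) forms F(5, 20) variables as a ratio of scaled independent χ² variables and compares the sample moments with l/(l − 2) and 2l²(k + l − 2)/(k(l − 2)²(l − 4)):

```python
import numpy as np

rng = np.random.default_rng(3)
reps, k, l = 400_000, 5, 20

# Equation (2.29): F = (X/k) / (Y/l) with X ~ chi2(k), Y ~ chi2(l).
x = rng.chisquare(k, reps)
y = rng.chisquare(l, reps)
f = (x / k) / (y / l)

# Theoretical moments, valid for l > 4.
mean_f = l / (l - 2)
var_f = 2 * l**2 * (k + l - 2) / (k * (l - 2) ** 2 * (l - 4))
print(f.mean(), mean_f)   # both near 1.11
print(f.var(), var_f)     # both near 0.71
```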

Figure 2.7: Probability density functions for F(k, l) random variables with (k, l) = (5, 5), (5, 20), and (5, 100).

The χ² distribution with 2 df is an exponential distribution with θ = 2.

2.7.12 Example: Waiting Times in a Poisson Process. The exponential distribution also arises when studying waiting times in a Poisson process. We used a Poisson process in [2.4.4] to model the occurrence of wind speed peaks over a threshold. If the threshold is large, the distribution of waiting times is useful for making inferences about the frequency with which we might expect damaging winds. Here, we will derive the waiting time distribution for a Poisson process with intensity λ. Let T be the waiting time for the first event