6.5.3 Multivariate Tests of the Mean. There are at least two ways to test the global null hypothesis of the equality of mean fields. One is

This is demonstrated in the following example.

Let X be an m-dimensional normal random vector with mean $\mu = (0, \ldots, 0)^T$ and covariance matrix


$\Sigma = I$ (see footnote 4). Let Y be another m-dimensional normal random vector defined by Y = X + a where a = (2, 0, ..., 0) and let y be a realization of Y. We want to test the null hypothesis, $H_0$, that y belongs to the population defined by X. The Mahalanobis test statistic

$$ D^2(y) = (y - \mu)^T \Sigma^{-1} (y - \mu) = \sum_{i=1}^{m} y_i^2 \tag{6.8} $$

has a $\chi^2$ distribution with m degrees of freedom under $H_0$. Its expected value under the alternative hypothesis, which is true by construction, is $E(D^2) = 2^2 + m$. These expected values, and corresponding 5% significance level values for the test statistic under $H_0$, are:

    m    E(D^2) under Ha    5% critical value of chi^2_m under H0
    1    5                  3.8
    2    6                  6.0
    3    7                  7.8
    4    8                  9.5

We see that for m = 1 the expected Mahalanobis distance is larger than the critical value; usually the null hypothesis will correctly be rejected. However, as more components that contain only noise are included, the chances of detecting the signal deteriorate.

6.5.5 Practical Problems. A practical problem arises in multivariate difference of means tests because the covariance matrix is generally not known. The problem was avoided in the previous example because $\Sigma$ was specified. Consequently, we were able to use $D^2(Y)$ (6.8) as the test statistic. In most problems, though, $\Sigma$ must be estimated. One implication is that we must base the test on the Hotelling $T^2$ statistic, which is the counterpart to $D^2$ if $\Sigma$ is replaced with the sample covariance matrix. To compute $T^2$ we must be able to invert the sample covariance matrix, which means that we need to have a sample of at least $n = m + 1$ realizations of the climate represented by X. However, in most climate applications, there are many more spatial degrees of freedom than observations (i.e., $n \ll m$). Then, reducing the number of spatial degrees of freedom by restricting the test to a subspace that is thought to contain the signal of interest is also a practical expedient.

6.5.6 Guess Patterns. The spatial degrees of freedom may be reduced by approximating the full m-dimensional fields X as a linear combination of a set of $\tilde{m}$ patterns $p^i$, as

$$ X \approx \sum_{i=1}^{\tilde{m}} \alpha_i p^i. \tag{6.9} $$

The coefficients $\alpha_i$ are usually fitted by a least squares approximation (see Chapter 8). The guess patterns $p^i$ should be specified independently of the outcome of the experiment.

There are various ways to obtain guess patterns.

1 Patterns known to yield efficient approximations of the analysed fields X: examples are Empirical Orthogonal Functions (EOFs; see Chapter 13) or, in case of a spherical geometry, surface spherical harmonics.

2 Problem-related patterns: patterns that were found as signals in similar but independent GCM experiments or patterns that were diagnosed from similar observations.

3 Physically based patterns: patterns that were derived by means of simplified theory that is appropriate to the hypothesis the experiment is designed to test.

It is often more profitable to invest in choices 2 and 3, which provide patterns with a physical basis, rather than to try to improve the power of the statistical tests. These choices also provide confirmation that the physical reasoning that leads to the experimental design and choice of patterns is correct. For example, if empirical guess patterns are derived from observations on the basis of physical reasoning (choice 2) and the null hypothesis that the 'experimental' treatment does not induce a climate signal is rejected, then there is statistical confirmation that the GCM has reproduced these aspects of the observed climate. If dynamically derived patterns are used (choice 3), rejection is an indication that the simplified theory behind the guess patterns operates within the GCM, at least to a first order of approximation. Examples are presented in Sections 6.9, 6.10 and Chapter 7.

6.5.7 Optimizing the Signal-to-Noise Ratio. Hasselmann [166, 168] suggested the following interesting way to construct an optimal guess pattern $p^o$ from a given guess pattern $p$.
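The least-squares fit of the coefficients $\alpha_i$ in (6.9), which is how a field is restricted to the subspace spanned by a few guess patterns, can be sketched as follows. This is only a minimal illustration under stated assumptions: the two guess patterns, the four-point field and the helper names (`dot`, `solve`, `fit_coefficients`) are hypothetical and are not taken from the text, and the normal equations are solved with a small hand-written elimination routine rather than a library call.

```python
# Hypothetical illustration of the least-squares fit behind (6.9).
# Patterns, field values and helper names are invented for this sketch.

def dot(u, v):
    """Scalar product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination
    with partial pivoting (adequate for a handful of patterns)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_coefficients(x, patterns):
    """Least-squares coefficients alpha_i from the normal equations
    G alpha = b, with G[i][j] = <p^i, p^j> and b[i] = <p^i, x>."""
    G = [[dot(pi, pj) for pj in patterns] for pi in patterns]
    b = [dot(pi, x) for pi in patterns]
    return solve(G, b)

# m = 4 grid points, two non-orthogonal guess patterns (assumed values).
patterns = [[1.0, 1.0, 1.0, 1.0],
            [1.0, 1.0, 0.0, 0.0]]
# Field built as 2*p^1 - 1*p^2 plus a part the patterns cannot represent.
x = [1.5, 0.5, 2.3, 1.7]

alpha = fit_coefficients(x, patterns)
approx = [sum(a * p[k] for a, p in zip(alpha, patterns)) for k in range(len(x))]
residual = [xk - ak for xk, ak in zip(x, approx)]
```

In this toy case the m = 4 values of the field are summarized by two coefficients, and the residual is orthogonal to both patterns, which is the defining property of the least-squares fit. A test would then be applied to the coefficients instead of the full field.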

Footnote 4: $I$ denotes the $m \times m$ identity matrix.

Let X be a random vector of dimension $m$ with covariance matrix $\Sigma$ and expectation $\mu_X$. Let Y


be another m-dimensional random vector with the same covariance matrix and expectation $\mu_Y \neq \mu_X$. Next, let $p$ be a guess pattern representing the anticipated form of the true signal $\mu_Y - \mu_X$. This pattern will not point in exactly the same direction as the true signal, but we will act as if $p$ were the true signal. Then the challenge is to find an optimal guess pattern $p^o$ that maximizes the likelihood of signal detection.

To do so we consider the signal-to-noise ratio

$$ r = \frac{\langle p, p^o \rangle^2}{\mathrm{Var}\left(\langle Y - X, p^o \rangle\right)}, \tag{6.10} $$

where $\langle \cdot, \cdot \rangle$ denotes the scalar, or dot, product of two vectors. The numerator in (6.10) is the strength of the (anticipated) signal in the direction of the optimal guess pattern $p^o$. The denominator is the variance of the noise, $Y - X$, in the direction of $p^o$. When $r$ is large, the likelihood of rejecting the null hypothesis $H_0: \mu_Y - \mu_X = 0$, and thus detecting a nonzero signal in the direction of $p$, is also large.

We now specify $p^o$. Because $r$ does not depend on the length of $p^o$ we may constrain $p^o$ so that

$$ \langle p, p^o \rangle^2 = 1. \tag{6.11} $$

Then $r$ may be maximized by minimizing the denominator of (6.10),

$$ \mathrm{Var}\left(\langle Y - X, p^o \rangle\right) = 2 (p^o)^T \Sigma p^o. \tag{6.12} $$

An example of an application of this optimization procedure (Hegerl et al. [172]) is given in some detail in Section 7.4. Other applications include Bell [37, 39], Mikolajewicz, Maier-Reimer, and Barnett [277] and Hannoschöck and Frankignoul [161].

6.5.8 Hierarchies. When an extended set of guess patterns is available, step-wise test procedures are also possible within the multivariate testing paradigm discussed in this section.

For example, suppose a set of guess patterns $\{p^i : i \in I\}$ contains a subset of patterns that are physically derived (choices 2 and 3 in [6.5.6]). Here $I$ is a collection of indices. We call the low-dimensional space that is spanned by this subset the 'signal space.' The space spanned by the full collection of guess patterns, that is, the full set of patterns that are likely to contain the sought-after signal, is then given by the union ($\cup$) of the signal space and a complementary space, denoted with the subscript $\perp$, which is the space spanned by the guess patterns that are not contained in the physically derived subset. The full response, say $Z = Y - X$, is then written as $Z = Z_{\parallel} + Z_{\perp}$. The components parallel and perpendicular to the signal space, $Z_{\parallel}$ and $Z_{\perp}$, are then tested. The parallel component is projected on the problem-specific guess patterns contained in the physically derived subset and the perpendicular component is tested using problem-independent guess patterns (choice 1 in [6.5.6]) such as EOFs. An example is given in Section 7.2.
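The signal-to-noise ratio (6.10), with the noise variance written as in (6.12), can be evaluated numerically for any candidate pattern. The sketch below is an illustration under explicit assumptions, not the procedure of the cited papers: the guess pattern and noise variances are invented, $\Sigma$ is taken to be diagonal so that no matrix inversion is needed, and the "weighted" candidate $\Sigma^{-1} p$ is the standard Lagrange-multiplier solution of minimizing (6.12) subject to (6.11), which is assumed here rather than derived.

```python
# Hypothetical numbers throughout. Sigma is assumed diagonal, which is a
# simplification made only to keep this sketch free of matrix routines.

def dot(u, v):
    """Scalar product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))

def snr(p, p_o, sigma_diag):
    """Signal-to-noise ratio (6.10), using the noise variance (6.12):
    Var(<Y - X, p_o>) = 2 p_o^T Sigma p_o for a diagonal Sigma."""
    signal = dot(p, p_o) ** 2
    noise = 2.0 * sum(s * w * w for s, w in zip(sigma_diag, p_o))
    return signal / noise

def rescale(p, p_o):
    """Rescale p_o so that <p, p_o>^2 = 1, the constraint (6.11)."""
    c = dot(p, p_o)
    return [w / c for w in p_o]

p = [1.0, 1.0, 1.0]            # guess pattern (assumed)
sigma_diag = [0.5, 1.0, 4.0]   # noise variance at each point (assumed)

# Candidate 1: use the guess pattern itself.
raw = rescale(p, p)
# Candidate 2: down-weight noisy components, p_o proportional to Sigma^{-1} p.
weighted = rescale(p, [pi / s for pi, s in zip(p, sigma_diag)])

r_raw = snr(p, raw, sigma_diag)
r_weighted = snr(p, weighted, sigma_diag)

# r is unchanged when the candidate is stretched, which is why the
# normalization (6.11) may be imposed without loss of generality.
r_raw_stretched = snr(p, [2.0 * w for w in raw], sigma_diag)
```

With these made-up variances the inverse-variance weighted pattern gives the larger ratio, because it suppresses the component of the guess pattern that lies in the noisiest direction.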

The guess pattern that minimizes (6.12) satisfies

$$ \frac{d}{dp^o} \left[ 2 (p^o)^T \Sigma p^o - \nu \left( \langle p, p^o \rangle^2 - 1 \right) \right] = 0, $$

The approach discussed above imposes a simple ordering on a set of guess patterns: the full set of patterns that are likely to contain the sought-after signal and a smaller subset of patterns derived