decision making process that tries to determine the truth of statements, called hypotheses, proposed before seeing the data.

These programs are very useful, however, because the coverage and consistency of their observing networks (in space and time and also in terms of the observed variables) are greatly enhanced relative to the regular observing network.

4: Concepts in Statistical Inference

variables. Even if we have enough data to establish a statistical link, we can not exclude the possibility that the repeated coincidence of two events is caused by another non-observed process. Our data coverage allows us to study only an open subsystem of the full system. In contrast, verifiable, ‘confirmatory’ statements require closed systems (for a discussion of this fundamental problem, see Oreskes, Schrader-Frechette, and Beltz [301]).

All observational data reflect the same trajectory of the climate system during the past tens or hundreds of years. Certainly, there are many different data sets, such as air pressure reported from land stations or sea-surface temperature reported from ships of opportunity (see Chapter 3). These data sets differ somewhat even if they purportedly represent the same variable, say near-surface wind (see [3.1.5]), but these differences are due to different observational, reporting and analysis practices. They do not represent the kind of independent information about the climate system that would be obtained by observing the same variables over a period of similar length at another point in time (e.g., beginning two centuries ago). In other words, such data sets do not offer the option for confirmatory analysis.

This limitation has a severe consequence: Many people, probably hundreds or thousands, have used different techniques to screen our ‘one’ observational record for rare events. Most of these ‘unusual’ results are eventually published in articles in scientific journals. Clearly, some of these ‘unusual’ facets are due to peculiar and rare circumstances that are, nevertheless, ‘usual’; they are ‘Mexican Hats’ (to use an analogy from Section 6.4) and can not be contested with a statistical test. We can identify an ‘unusual’ object by comparing it with all others in the observational record. Thus the statement, or null hypothesis, ‘this object is not unusual’ cannot be contested with a statistical test since independent data are unavailable. No statistical test, regardless of its power or elegance, can overcome this problem, although there are two possible solutions. The first is to extend the observational record backwards by creating new paleo data sets,² the second is to postpone testing the developed theories until nature generates enough independent data. Using suitably designed GCM experiments to test a hypothesis derived from the observational record is another approach to confirmatory analysis that is often satisfying.

² Paleo data are data derived from indirect evidence, such as sediments, that are believed to be representative of the state of climatic components before the current short instrumental period.

4.1.3 Confirmatory Analysis of Simulated Data. The situation is different when dealing with data generated in simulations with GCMs since new additional data can be created, and experiments can be designed to sort out different hypotheses. However, climate models can not be completely validated, which is a big limitation.³ The answers given by GCMs could simply be an artifact of the model.

³ General Circulation Models are tuned to reproduce, to the extent possible, the statistics of the observational record of the last few decades. Success in this regard is not a guarantee that the models can successfully simulate natural climate variability on longer time scales. It is also not a guarantee that the models will respond correctly to changes in, for example, the chemical composition or turbidity of the atmosphere. See Oreskes et al. [301]. However, GCMs are considered powerful tools for examining the sensitivity of the climate system since they are based, to a large extent, on physically robust concepts.

Experimentation with GCMs began in the 1960s, when pioneers such as Manabe and Bryan [265] examined the sensitivity of the climate to enhanced greenhouse gas concentrations. The standard methodology is to produce a pair of simulations that deviate from each other in only one aspect (such as different greenhouse gas concentrations or sea-surface temperature regimes). This type of experiment is well designed and can be used to confirm hypotheses derived from the observational record or other model experiments. (See Chapter 7 for examples.)

4.1.4 Estimation of Parameters. In estimation, a sample of realizations of a random variable is used to try to infer the value of a parameter that describes some property of the random variable. That is, a function of the observations is taken to be an educated guess of the true parameter value. This educated guess, the estimator, is either a number (point estimator) or an interval (interval estimator). Ideally, the point estimate is in the neighbourhood of the true value, and the neighbourhood becomes smaller with increasing sample size. Similarly, a good interval estimator uses the sample to select a range of parameter values that is likely to contain the true parameter. This interval is constructed to cover the true parameter with a fixed, high probability (typically


95%) in repeated sampling.⁴ Thus, the length of the interval decreases with increasing sample size.

Various ‘parameters’ are subject to estimation, such as the conventional moments (see [2.6.7]) that characterize the probability distribution of the observed random variable. However, estimation is not limited to such elementary parameters; one may also want to estimate the entire probability distribution, or more exotic parameters such as the ‘level of recurrence’ of two random variables (Sections 6.9–6.10). The ‘random variable’ might really be a random field observed at $m$ points and we might want to estimate the $m^2$ parameters that comprise the field’s covariance matrix.

As discussed in [1.2.1], there are no ‘right’ or ‘wrong’ statements in the realm of estimation; rather, statements can only be considered in terms of precision and reliability. There are some well-defined concepts that can be used, in principle, to obtain estimators with desirable properties. For example, the maximum likelihood method can be used to construct estimators that are ‘asymptotically optimal’ under broad regularity conditions (see [5.3.8]). However, the complexity that is often encountered in climatology causes the design of estimators to be closer to art than sound craftsmanship.

4.1.5 Point Estimation: Examples. A simple example of a point estimation exercise can be found at the end of [2.8.7] where we report the estimated correlation between the standard Southern Oscillation Index (SOI) and an SST index developed by Wright [426]. Here the sample consists of the 624 realizations of the monthly mean SOI and the monthly mean SST index observed between 1933 and 1984. The correlation between corresponding random variables $I_{SO}$ and $I_{SST}$ is estimated to be

$$\hat{\rho}_{SST,SOI} = 0.67.$$

A more involved example of an estimation exercise is found in [1.2.6], where optimally

manner (see [5.2.7]) and the estimates are subsequently employed in a Canonical Correlation Analysis (see Chapter 14). The patterns shown in Figure 1.13 are a best guess rather than the true canonical correlation patterns. Note that these patterns represent simultaneous estimates of several hundred parameters.

4.1.6 Interval Estimators: An Example. We return to the example that deals with the correlation between the SOI and the SST based index of the Southern Oscillation [1.2.6]. In [8.2.3] we impose a model on the bivariate random variable, $\mathbf{X} = (I_{SO}, I_{SST})$, and then use it to construct an interval estimator $(\rho_L, \rho_U)$ for $\rho_{SST,SOI}$. The estimator is designed so that the interval will cover the true value of $\rho_{SST,SOI}$ 19 out of 20 times if the ‘experiment’ that resulted in the 1933 to 1984 segments of the SO and SST indices is repeated infinitely often. Note that it is the endpoints of the interval that vary from one replication of the experiment to the next: the true value of $\rho_{SST,SOI}$ is fixed by the physical mechanism that connects SST variations in the Equatorial Pacific with the Southern Oscillation. Performing the computation with the observed indices yields $\rho_L = 0.621$ and $\rho_U = 0.708$. The confidence interval is therefore given by the inequality

$$0.621 < \rho_{SST,SOI} < 0.708. \tag{4.1}$$

Note that (4.1) does not include a probability statement about its correctness. In that sense, the ‘confidence interval’ (4.1) really provides no ‘confidence.’

None the less, interval estimators are much more useful than point estimators because they give concrete expression to the idea that the estimator is but another random variable subject to sampling variation. Unfortunately, often in practice a confidence interval cannot be constructed. Then, an estimator is often considered useful if it performs well in some controlled laboratory setting, or returns ‘physically reasonable’ numbers or distributions.
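The ‘19 out of 20’ interpretation of an interval estimator can be illustrated with a small simulation. The sketch below is not the model of [8.2.3], which is not reproduced here. It assumes instead that the pairs are independent draws from a bivariate normal distribution and uses Fisher's z-transform, one common way to build an approximate interval for a correlation. The function names (`sample_correlation`, `fisher_interval`, `coverage`) and the number of replications are illustrative choices. Monthly climate indices are serially correlated, so the independence assumption is optimistic for real data.

```python
import math
import random
from statistics import NormalDist


def sample_correlation(x, y):
    """Pearson sample correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_interval(r, n, level=0.95):
    """Approximate interval for a correlation via Fisher's z-transform.

    Assumes n independent bivariate normal pairs (an assumption of this
    sketch, not a statement about the source's model).
    """
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


def coverage(rho, n, replications, level=0.95, seed=0):
    """Fraction of simulated 'experiments' whose interval covers rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(replications):
        # One replication: n pairs with true correlation rho.
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rho * a + scale * rng.gauss(0.0, 1.0) for a in x]
        lo, hi = fisher_interval(sample_correlation(x, y), n, level)
        hits += lo < rho < hi  # the endpoints vary, rho stays fixed
    return hits / replications


if __name__ == "__main__":
    lo, hi = fisher_interval(0.67, 624)
    print(f"interval for r = 0.67, n = 624: ({lo:.3f}, {hi:.3f})")
    print("empirical coverage:", coverage(rho=0.67, n=624, replications=300))
```

With the rounded estimate 0.67 and 624 pairs, this construction gives an interval of roughly (0.624, 0.711), close to but not identical with (4.1). The simulated coverage should come out near 0.95, which is a statement about the procedure over many replications and not about any single interval.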