\[
S_{XX} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 .
\]

The sum of squared errors can be expressed similarly as

\[
\mathrm{SSE} = S_{YY} - \hat{a}_1 S_{XY} ,
\]

where

\[
S_{YY} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 .
\]

$S_{YY}$ is often called the total sum of squares and denoted SST. Be aware of the potential confusion here between the common climatological practice of referring to sea-surface temperature as SST and the equally common statistical practice of referring to the total sum of squares as SST. The quantity

\[
\mathrm{SSR} = \hat{a}_1 S_{XY}
\]

is often called the sum of squares due to regression and denoted SSR. It is easily verified that $\mathrm{SSR} = \mathrm{SST} - \mathrm{SSE}$.

$R^2$ is also a measure of the extent to which $X$ determines $Y$. The adjective multiple is added because in multiple regression (Section 8.4) this number is a measure of the extent to which all variables on the right hand side of the regression equation determine $Y$. While a useful diagnostic, it is just one of several tools which should be used to assess the utility and goodness-of-fit of a model. $R^2$ is discussed further in [8.3.12]. Additional diagnostic tools are discussed in [8.3.13,14,16,18] and [8.4.11].

In our SOI example, $R^2 = 0.445$, meaning that somewhat less than one-half of the total variability in the SO index is represented by the SST index. This is clearly in agreement with Figure 8.1, where we see quite a bit of scatter about the fitted line.

8.3.4 The Relationship Between Least Squares and Maximum Likelihood Estimators. When the random variables $E_i$ (8.10) are independent and identically normally distributed, it is easy to demonstrate that the least squares estimators are also maximum likelihood estimators. Under these conditions, the log-likelihood function $l(a_0, a_1 \mid x_i, y_i)$, for $i = 1, \ldots, n$, is given by

\[
-2\,l(a_0, a_1 \mid x_i, y_i)
  = n \log\!\left(2\pi \sigma_E^2\right)
  + \frac{1}{\sigma_E^2} \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2 .
\]

The likelihood estimators are chosen to maximize the likelihood, or equivalently the log-likelihood, of the estimated errors $y_i - a_0 - a_1 x_i$. Maximizing the log-likelihood with respect to $a_0$ and $a_1$ results in precisely the least squares estimators. This means that least squares estimators have the optimality properties of maximum likelihood estimators (Section 5.3) when the normal distributional assumption is satisfied.

8.3.5 Properties. While the estimators (8.16), (8.17), and (8.18) have been written in their realized forms, they can also be considered as random variables whose distribution is conditional on the realized values of $X$. We will briefly state the distributional properties of these estimators. The derivation of these properties is discussed in [8.3.20].

1. $\hat{a}_0$, $\hat{a}_1$, and $\hat{\sigma}_E^2$ are unbiased estimators of $a_0$, $a_1$, and $\sigma_E^2$ respectively.

2. $\hat{\sigma}_E^2$ is independent of $\hat{a}_0$ and $\hat{a}_1$.

3. $(n-2)\hat{\sigma}_E^2/\sigma_E^2 \sim \chi^2(n-2)$.

4. $\hat{a}_1 \sim \mathcal{N}\!\left(a_1,\; \sigma_E^2/S_{XX}\right)$.

5. $\hat{a}_0 \sim \mathcal{N}\!\left(a_0,\; \sigma_E^2 \sum_{i=1}^{n} x_i^2/(n S_{XX})\right)$.

8.3.6 Inferential Methods. The distributional properties stated above provide a number of inferential results that are useful for interpreting a fitted regression model. Bear in mind, however, that inferences made in the following way may be compromised if the assumptions embedded in the procedures are violated. See [8.3.17] for more discussion about this.

8.3.7 A Confidence Interval for the Slope Parameter. A $\tilde{p} \times 100\%$ confidence interval for the slope of the regression line, $a_1$, is given by

\[
\left( \hat{a}_1 - \frac{t_{(1+\tilde{p})/2}\,\hat{\sigma}_E}{\sqrt{S_{XX}}} ,\;
       \hat{a}_1 + \frac{t_{(1+\tilde{p})/2}\,\hat{\sigma}_E}{\sqrt{S_{XX}}} \right),
\]

where $t_{(1+\tilde{p})/2}$ is the $((1+\tilde{p})/2)$-quantile of the $t$ distribution with $n-2$ degrees of freedom (see Appendix F).

In our SOI example $n-2 = 622$, $S_{XX} = 3.320 \times 10^6$, and $\hat{\sigma}_E = 12.2$. Therefore, assuming that there is no dependence between observations (an assumption we know to be false), the 95% confidence interval for the slope of the fitted line is (0.137, 0.163). However, dependence between observations causes the actual 95% confidence interval for $a_1$ to be wider.

8.3.8 Tests of the Slope Parameter. The null hypothesis that $a_1$ has a particular value, say $a_1^*$, can be tested by comparing

\[
T = \frac{\hat{a}_1 - a_1^*}{\hat{\sigma}_E/\sqrt{S_{XX}}}
\]

against critical values from the $t$ distribution with $n-2$ degrees of freedom. It is often of interest to know whether or not $a_1$ is significantly different from zero, that is, whether or not there is a regression relationship between $X$ and $Y$.

To test $H_0$: $a_1 = 0$ against $H_a$: $a_1 \neq 0$ in our SOI example, we compute

\[
t = \frac{\hat{a}_1}{\hat{\sigma}_E/\sqrt{S_{XX}}}
  = \frac{0.15}{12.2/\sqrt{3.320 \times 10^6}} = 22.4 .
\]

This realized value of $T$ is compared with critical values from $t(622)$ and is found to be significant at much less than the 0.1% level. The effect of dependence between observations is, generally, to increase the frequency with which the null hypothesis is rejected when it is true, that is, to decrease the apparent significance level. Here it is certain that $H_0$ is false, but often, when the evidence is more equivocal, it is important to consider the effects of dependence (see Section 6.6).

Another approach to testing whether or not a regression relationship exists is based on the observation that, when $a_1 = 0$, the regression sum of squares SSR is an unbiased estimator of the error variance which is distributed $\chi^2(1)$ and is independent of $\hat{\sigma}_E^2$. (These results can be proved using methods similar to those in [8.3.20].) Since $(n-2)\hat{\sigma}_E^2/\sigma_E^2$ is distributed $\chi^2(n-2)$, we obtain that

\[
F = \frac{\mathrm{SSR}}{\hat{\sigma}_E^2} \sim F(1, n-2)
\]

under the null hypothesis. Thus the test can be conducted by comparing $F$ with critical values from Appendix G.

Because we have fitted a linear model that depends upon only one factor, the $t$ and $F$ tests are equivalent. In fact, $F = T^2$, and the square of a $t$ random variable with $n-2$ df is distributed as $F(1, n-2)$. Thus identical decisions are made provided that the $t$ test is conducted as a two-sided test.
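As a numerical sanity check on the sums-of-squares identities used above, the following sketch computes $S_{XX}$, $S_{YY}$, $S_{XY}$, SSE, and SSR directly and verifies that $\mathrm{SSR} = \mathrm{SST} - \mathrm{SSE}$. It assumes NumPy is available, and the data are synthetic, illustrative stand-ins rather than the SOI/SST series:

```python
import numpy as np

# Synthetic (x, y) data; slope and noise level chosen loosely to echo the
# text's example, but the numbers here are illustrative only.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 0.15 * x + rng.normal(scale=0.5, size=n)

xbar, ybar = x.mean(), y.mean()
S_XX = np.sum((x - xbar) ** 2)           # = sum(x_i^2) - n * xbar^2
S_YY = np.sum((y - ybar) ** 2)           # total sum of squares, SST
S_XY = np.sum((x - xbar) * (y - ybar))

a1 = S_XY / S_XX                          # least squares slope
a0 = ybar - a1 * xbar                     # least squares intercept

SSE = S_YY - a1 * S_XY                    # sum of squared errors
SSR = a1 * S_XY                           # sum of squares due to regression

# Identities from the text: SSE equals the residual sum of squares,
# SSR = SST - SSE, and R^2 = SSR / SST.
assert np.isclose(SSE, np.sum((y - a0 - a1 * x) ** 2))
assert np.isclose(SSR, S_YY - SSE)
R2 = SSR / S_YY
```

Because SSE and SSR partition $S_{YY}$ exactly, the assertions hold for any data set, not just this synthetic one.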
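The equivalence of least squares and maximum likelihood estimators stated in [8.3.4] can be illustrated by minimizing $-2\,l(a_0, a_1 \mid x_i, y_i)$ numerically and comparing the result with the closed-form estimates. This is a sketch assuming SciPy is available; the data, starting point, and fixed error variance are illustrative choices, not part of the text:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative data (not the SOI/SST series from the text).
rng = np.random.default_rng(1)
n = 300
x = rng.normal(size=n)
y = 1.0 + 0.15 * x + rng.normal(scale=0.5, size=n)

def neg2_loglik(params, sigma2=0.25):
    # -2 l(a0, a1 | x_i, y_i); the minimizing (a0, a1) do not depend on
    # sigma2, so it can be held fixed for this comparison.
    a0, a1 = params
    return n * np.log(2.0 * np.pi * sigma2) + np.sum((y - a0 - a1 * x) ** 2) / sigma2

# Closed-form least squares estimates.
S_XY = np.sum((x - x.mean()) * (y - y.mean()))
S_XX = np.sum((x - x.mean()) ** 2)
a1_ls = S_XY / S_XX
a0_ls = y.mean() - a1_ls * x.mean()

res = minimize(neg2_loglik, x0=[0.0, 0.0], method="Nelder-Mead")
a0_ml, a1_ml = res.x  # agree with (a0_ls, a1_ls) to optimizer tolerance
```

Since $-2l$ is, up to constants, the sum of squared errors, the numerical minimizer recovers the least squares solution to within the optimizer's tolerance.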
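The interval and test arithmetic of [8.3.7] and [8.3.8] can be reproduced directly from the quoted values. A minimal sketch, assuming SciPy's `t` and `f` distributions for the quantiles and tail probabilities:

```python
import numpy as np
from scipy import stats

# Values quoted in the SOI example.
df = 622                  # n - 2
S_XX = 3.320e6
sigma_E = 12.2            # hat{sigma}_E
a1_hat = 0.15

se = sigma_E / np.sqrt(S_XX)               # standard error of the slope
t_crit = stats.t.ppf(0.975, df=df)         # (1 + p)/2 quantile for p = 0.95
ci = (a1_hat - t_crit * se, a1_hat + t_crit * se)   # approx (0.137, 0.163)

T = a1_hat / se                             # test of H0: a1 = 0
F = T ** 2                                  # equivalent F(1, n - 2) statistic

# Two-sided t-test p-value; the F test gives the same decision because
# T^2 with n - 2 df is distributed as F(1, n - 2).
p_t = 2 * stats.t.sf(abs(T), df=df)
p_F = stats.f.sf(F, 1, df)
```

The computed interval matches the (0.137, 0.163) reported in the text, and both p-values fall far below the 0.1% level, in line with the discussion above (which, as the text notes, ignores the dependence between observations).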


[Figure: SO and Tropical Pacific SST Indices]

8.3.9 Inferences About the Intercept. A $\tilde{p} \times 100\%$ confidence interval for the intercept of the regression line, $a_0$, has bounds given by