
Figure 8.11: Scatter plots illustrating the fit of the regression of the SO index on the SST index. This example is introduced in [8.1.3]. Three outliers, occurring in March 1961, February 1978 and February 1983, can be identified. In the upper panel the absolute standardized residuals are plotted against the estimated conditional mean. They are plotted against time in the lower panel.

Figure 8.12: A probability plot of the standardized quantiles of the residuals from the regression of the SO index on the SST index, against the quantiles of the standard normal distribution.

corner of the graph. In general, these residuals are acceptably close to being normally distributed. However, be aware that even minor departures from the normal distribution assumption can have a detrimental effect on inferences made about the error variance.

8.3.15 Why Use Least Squares? While we have, on occasion, warned that inferences made with least squares estimators may not be robust, their widespread use is justified for more reasons than just computational ease and the tractability of inference when errors are independent and normally distributed.

• As a consequence of the Gauss–Markov Theorem (see [147, p. 219], or [197, p. 301]), least squares estimators of linear model parameters have minimum variance amongst all unbiased linear estimators as long as the errors are independent and identically distributed with zero mean and constant finite variance. This is a relatively strong reason to use least squares estimators, despite the insistence that the estimators be linear (i.e., that they be expressible as linear combinations of the response variables Y_i), because our ability to construct nonlinear estimators is limited. This property of least squares estimators does not persist if errors do not have constant variance (see, e.g., Section 8.6, and [62, pp. 352–353]).

• When the errors E_i are elements of a stationary time series, the least squares estimators are still, under relatively broad conditions, asymptotically the best (i.e., minimum variance) linear unbiased estimators of the regression parameters (see [323, pp. 588–595]).

8.3.16 Diagnostics: Serial Correlation. While the last item above reassures us that least squares estimators can be consistent when errors are dependent, it says nothing about the reliability of inferences under dependence. Unfortunately, the inference procedures outlined above are very sensitive to departures from independence (see Section 6.6; [62, p. 375]; and also [363], [442], [454]).

The Durbin–Watson statistic (see [104], [107], [108], and [109]), computed as

d = \frac{\sum_{i=1}^{n-1} (e_{i+1} - e_i)^2}{SSE},    (8.24)

is commonly used to detect serial correlation. When errors have positive serial correlation, the differences (e_{i+1} - e_i)^2 tend to be small compared with those when errors are independent. Therefore small values of d (near zero) indicate positive serial correlation.
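As a rough numerical illustration, the statistic in (8.24) can be computed directly from a sequence of residuals. The following is a minimal Python sketch using made-up residual values (not the SOI data) to show the two extremes:

```python
def durbin_watson(e):
    """Durbin-Watson statistic (8.24): sum of squared successive
    differences of the residuals, divided by SSE = sum of e_i^2."""
    sse = sum(x * x for x in e)
    num = sum((e[i + 1] - e[i]) ** 2 for i in range(len(e) - 1))
    return num / sse

# Hypothetical strongly positively correlated residuals: successive
# values are nearly equal, so the squared differences are tiny and d
# is close to 0.
pos = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1]

# Alternating signs (negative serial correlation): each difference is
# large, pushing d toward 4.
neg = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print(durbin_watson(pos))  # ~0.01, far below 2
print(durbin_watson(neg))  # 3.5, near 4
```

For residuals that behave like independent noise, d falls near 2, as the expansion below shows.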


When errors are independent, we see from (8.24) that

d = \frac{\sum_{i=1}^{n-1} (e_{i+1} - e_i)^2}{SSE}
  = \frac{\sum_{i=1}^{n-1} (e_{i+1}^2 + e_i^2 - 2 e_{i+1} e_i)}{SSE}
  \approx \frac{SSE + SSE - 0}{SSE} = 2.

Hence values of d near 2 are consistent with independent errors. If the alternative hypothesis is that the errors are negatively (rather than positively) correlated, then the test statistic should be 4 − d.

Computation of the significance of the observed d under the null hypothesis of independence is somewhat involved. Durbin and Watson give a range of critical values for samples of size n ≤ 100. The tabulated critical values consist of pairs d_L and d_U such that H0 can always be rejected if max(d, 4 − d) < d_L and H0 should not be rejected if min(d, 4 − d) > d_U. Between these limits, the determination of whether or not d is significantly different from 2 depends on the specific values x_i, for i = 1, ..., n, taken by the independent variable. Durbin and Watson [108, 109] describe an approximation to the distribution of d based on the beta distribution that can be used with moderate to large sample sizes when the test based on the tabulated values is inconclusive or when the …

… is positive serial correlation) or more frequently (negative serial correlation) than would be expected in a sequence of independent errors. The test statistic used in the runs test, denoted U, is the number of sign changes plus 1. Draper and Smith [104, pp. 160–161] give tabulated critical values when the number of residuals of both signs is small (≤ 10). A normal approximation can be used when samples are large. It can be shown that the mean and variance of U under H0 are

\mu_U = \frac{2 n_1 n_2}{n_1 + n_2} + 1,
\sigma_U^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)},

where n_1 and n_2 are the number of positive and negative residuals. Then H0: no serial correlation can be tested against Ha: positive serial correlation by comparing (U − μ_U + 1/2)/σ_U against the lower tail critical values of the standard normal distribution (Appendix D). Here we are approximating a discrete distribution with a continuous distribution; the half that is added is a continuity correction that accounts for this. For our SOI example, we have n_1 = 295 and n_2 = 329, so that μ_U = 312.07 and σ_U = 12.44. We observe u = 307, a value that is not significantly different from μ_U.
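The runs test is straightforward to implement. The sketch below (in Python; the short residual list is hypothetical, while the counts n_1 = 295 and n_2 = 329 are those of the SOI example) counts sign changes and applies the normal approximation with the continuity correction:

```python
import math

def runs_test(e):
    """Runs test for serial correlation in residuals.

    U = number of sign changes + 1. Under H0 (no serial correlation),
    U is approximately normal with mean mu_U and variance sigma_U^2
    given in the text. Returns (u, mu, sigma, z), where z includes the
    continuity correction of +1/2. Zero residuals count as negative.
    """
    signs = [x > 0 for x in e]
    u = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n1 = sum(signs)       # number of positive residuals
    n2 = len(e) - n1      # number of negative residuals
    n = n1 + n2
    mu = 2.0 * n1 * n2 / n + 1.0
    sigma = math.sqrt(2.0 * n1 * n2 * (2.0 * n1 * n2 - n)
                      / (n * n * (n - 1)))
    z = (u - mu + 0.5) / sigma
    return u, mu, sigma, z

# SOI example: only the sign counts are needed for mu_U and sigma_U.
n1, n2 = 295, 329
mu = 2 * n1 * n2 / (n1 + n2) + 1
sigma = math.sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                  / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
z = (307 - mu + 0.5) / sigma  # observed u = 307
print(round(mu, 2), round(sigma, 2), round(z, 2))
```

The resulting z is well inside the acceptance region of the lower-tail normal test, matching the conclusion in the text that u = 307 is not significantly different from μ_U.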