[Figure panels for Figures 8.9 and 8.10. Recoverable axis labels: "Absolute Standardized Residual" (vertical) and "Fitted Line" (horizontal).]

Figure 8.9: A pair of scatter plots illustrating heteroscedasticity. The data were generated from $Y = 1 + 0.1x + x(1 - x)E$, where $E \sim N(0, 0.1^2)$. The upper panel shows 100 simulated data points and the line fitted by least squares. The lower panel displays the absolute standardized residuals as a function of the fitted line.

Figure 8.10: A scatter plot illustrating data generated from $Y = 1 + 0.1x + E$, where $E \sim N(0, 0.05^2)$. Two outliers have been inserted by setting the realizations of $E$ at $x = 0.5$ and $x = 0.95$ to $0.15$ and $-0.15$ respectively.
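The construction behind Figure 8.9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the caption does not say how the x values were chosen, so they are drawn uniformly on [0, 1] here; the random seed is arbitrary; the helper names `fit_line` and `standardized_residuals` are hypothetical; and the residuals are standardized by dividing by an estimate of the error standard deviation computed as the square root of SSE/(n - 2).

```python
import numpy as np

def fit_line(x, y):
    """Least-squares estimates (a, b) for the line y = a + b*x."""
    b, a = np.polyfit(x, y, 1)  # polyfit returns the slope first
    return a, b

def standardized_residuals(x, y):
    """Return fitted values and residuals divided by sigma_hat.

    Assumption: sigma_hat**2 = SSE / (n - 2).
    """
    a, b = fit_line(x, y)
    fitted = a + b * x
    resid = y - fitted
    sigma_hat = np.sqrt(np.sum(resid**2) / (len(x) - 2))
    return fitted, resid / sigma_hat

rng = np.random.default_rng(0)          # arbitrary seed
n = 100
x = rng.uniform(0.0, 1.0, size=n)       # assumed design: x uniform on [0, 1]
e = rng.normal(0.0, 0.1, size=n)        # E ~ N(0, 0.1^2)
y = 1.0 + 0.1 * x + x * (1.0 - x) * e   # noise scale x(1 - x) is largest at x = 0.5

fitted, z = standardized_residuals(x, y)
abs_z = np.abs(z)

# Compare the spread in the middle of the x range with the spread near the ends.
middle = (x > 0.25) & (x < 0.75)
print("mean |z|, middle of range:", abs_z[middle].mean())
print("mean |z|, ends of range:  ", abs_z[~middle].mean())
```

Plotting `abs_z` against `fitted` gives a display of the same kind as the lower panel: because the noise scale x(1 - x) vanishes at both ends of the range, the absolute standardized residuals should tend to be larger for fitted values near the middle.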

that at x = 0.95 is hidden, for reasons discussed in [8.3.18].

Studentized residuals, rather than standardized residuals, are often used in diagnostic plots. A studentized residual is obtained at point $x_i$ by fitting the regression model without the data pair $(x_i, y_i)$, computing the difference between $y_i$ and the estimate obtained from the fit, and finally dividing this deviation by the estimate of the standard error obtained from the fit. Outliers hidden in ordinary residual plots often become apparent in plots of studentized residuals because they do not affect the fit of the model used to estimate the studentized residual. Unfortunately, studentized residuals fail to identify the hidden outlier in Figure 8.10.

Diagnostic scatter plots of the residuals from the fitted regression of the SO index on Wright's SST index are displayed in Figure 8.11. No evidence of heteroscedasticity or systematic departure from the fitted line is apparent. However, three outliers can be observed, all of which are positive. Only one deviation (occurring in February 1983) corresponds to a known El Niño warm event.

8.3.14 Diagnostics: Probability Plots. As will be discussed in [8.3.15], skewness of the residuals (e.g., a tendency for there to be more residuals of one sign than another) should not immediately lead to the conclusion that all inferences about the regression model are invalid. None the less, once it has been determined that the model fits the data reasonably well, it is still useful to examine the residuals to see if there are gross departures from the normal distribution assumption, which might compromise the inferences. A useful diagnostic for this purpose is a normal probability plot⁵ of the (ordered) standardized residuals $e_{(i|n)}/\hat{\sigma}_E$ against the $((i - 0.5)/n)$-quantiles of the standard normal distribution.

As discussed in [3.1.3] and [4.2.2], such plots are constructed by plotting the points

$$\left( F_N^{-1}\!\left(\frac{i - 0.5}{n}\right),\ \frac{e_{(i|n)}}{\hat{\sigma}_E} \right), \quad \text{for } i = 1, \ldots, n.$$

The points will lie on an approximately straight line sloping upwards at a 45° angle when the residuals are approximately normal with variance $\hat{\sigma}_E^2$.

The probability plot for our SOI example is shown in Figure 8.12. We see that the central body of the distribution is very close to normal. The diagram shows that the left hand tail of the distribution is slightly narrower than that of a normal distribution and the right hand tail is slightly wider. The three outliers we identified previously can be seen at the upper right hand

⁵ Sometimes also called qq plots, or quantile-quantile plots.
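The two diagnostics described in this section, leave-one-out studentized residuals and the points of a normal probability plot, can be sketched as follows. This is an illustrative sketch under several assumptions rather than the authors' procedure: the data mimic the set-up of Figure 8.10 with 101 equally spaced x values (the caption does not state the sample size or design), the seed is arbitrary, the helper names are hypothetical, the error standard deviation is estimated as the square root of SSE/(n - 2), and "the estimate of the standard error obtained from the fit" is read here as that estimate computed from the fit that omits the point in question.

```python
import numpy as np
from statistics import NormalDist

def ols(x, y):
    """Least-squares intercept, slope, and sigma_hat (SSE / (n - 2) assumed)."""
    b, a = np.polyfit(x, y, 1)  # polyfit returns the slope first
    resid = y - (a + b * x)
    sigma_hat = np.sqrt(np.sum(resid**2) / (len(x) - 2))
    return a, b, sigma_hat

def studentized_residuals(x, y):
    """Leave-one-out residuals: refit without pair i, then scale the deviation."""
    n = len(x)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                 # drop the pair (x_i, y_i)
        a, b, sigma_hat = ols(x[keep], y[keep])
        out[i] = (y[i] - (a + b * x[i])) / sigma_hat
    return out

def probability_plot_points(x, y):
    """Points (F_N^{-1}((i - 0.5)/n), e_(i|n) / sigma_hat) for i = 1, ..., n."""
    a, b, sigma_hat = ols(x, y)
    z = np.sort((y - (a + b * x)) / sigma_hat)   # ordered standardized residuals
    n = len(z)
    std_normal = NormalDist()                    # standard normal distribution
    q = np.array([std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)])
    return q, z

rng = np.random.default_rng(1)                   # arbitrary seed
n = 101                                          # assumed sample size
x = np.linspace(0.0, 1.0, n)                     # assumed design
e = rng.normal(0.0, 0.05, size=n)                # E ~ N(0, 0.05^2)
i_mid = int(np.argmin(np.abs(x - 0.5)))
i_end = int(np.argmin(np.abs(x - 0.95)))
e[i_mid], e[i_end] = 0.15, -0.15                 # the two inserted outliers
y = 1.0 + 0.1 * x + e

t = studentized_residuals(x, y)
q, z = probability_plot_points(x, y)
print("studentized residual at x = 0.5: ", t[i_mid])
print("studentized residual at x = 0.95:", t[i_end])
```

Plotting `z` against `q` gives the probability plot; points close to the 45° line indicate residuals that are close to normal with the estimated variance. Whether a particular outlier stands out in `t` depends on the simulated sample, so no specific values are claimed here.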

8.3: Fitting and Diagnosing Simple Regression Models 157

[Figure panels: "Fit of SOI = a + b SST + noise" and "Linear fit of SOI = a + b SST + noise"; vertical axis label: "Absolute Standardized Residual".]