for the sample from the process with (α₁, α₂) = (0.9, −0.8) and

    \hat{\Sigma}_{\hat{\alpha}_2} = 0.061 \begin{pmatrix} 1 & -0.384 \\ -0.384 & 1 \end{pmatrix}

for the sample from the process with (α₁, α₂) = (0.3, 0.3). Note that the elements of α̂₂, α̂₁ and α̂₂, have the same estimated variance.

Approximate 95% confidence regions for (α₁, α₂) can then be derived from these estimates as follows. We assume that

    \hat{\alpha}_2 \sim \mathcal{N}(\alpha_2, \Sigma_{\hat{\alpha}_2}).

Consequently

    X = (\hat{\alpha}_2 - \alpha_2)^{\mathrm{T}} \Sigma_{\hat{\alpha}_2}^{-1} (\hat{\alpha}_2 - \alpha_2) \sim \chi^2(2).

Thus an approximate 95% confidence region is obtained by replacing \Sigma_{\hat{\alpha}_2} with \hat{\Sigma}_{\hat{\alpha}_2} and solving

    (\hat{\alpha}_2 - \alpha_2)^{\mathrm{T}} \hat{\Sigma}_{\hat{\alpha}_2}^{-1} (\hat{\alpha}_2 - \alpha_2) = X^2_{0.95}

where X^2_{0.95} is the 95% critical value of the χ²(2) distribution (see Appendix E).

The resulting confidence regions are shown in Figure 12.4. It is reasonable to believe that these

This is confirmed by extracting more information from the Monte Carlo experiment described in [12.2.5]. Each sample was used to estimate the asymptotic standard errors of α̂₁ and α̂₂ with (12.13). The mean estimate is compared with the actual variability of the 100 MLEs in the following table.

The observed standard deviation of 100 ML parameter estimates compared with the mean of 100 asymptotic estimates:

    (α₁, α₂) = (0.9, −0.8)
    T      Observed std. dev.       Mean
           α̂₁        α̂₂           estimate
    15     0.22       0.18          0.16
    60     0.087      0.078         0.080
    240    0.044      0.042         0.039

    (α₁, α₂) = (0.3, 0.3)
    T      Observed std. dev.       Mean
           α̂₁        α̂₂           estimate
    15     0.32       0.23          0.26
    60     0.12       0.10          0.13
    240    0.070      0.062         0.062

Only one column is used to describe the mean asymptotic estimate of standard error since the diagonal elements of \hat{\Sigma}_{\hat{\alpha}} are equal. The table

⁴ This is usually provided by the ML estimation software, but is usually also closely approximated by (12.7).
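As a numerical sketch of this confidence-region recipe (not from the book: the point estimates below are invented for illustration, while the covariance matrix 0.061 [[1, −0.384], [−0.384, 1]] is the estimate quoted in the text for the (0.3, 0.3) process), one can test whether a candidate parameter pair falls inside the approximate 95% region:

```python
# Illustrative sketch of the chi-squared confidence-region test.
# The ML estimates below are invented; the covariance matrix is the estimate
# quoted in the text for the process with (alpha1, alpha2) = (0.3, 0.3).
import numpy as np

alpha_hat = np.array([0.28, 0.33])             # hypothetical ML estimates
alpha_0   = np.array([0.30, 0.30])             # candidate true parameters
Sigma_hat = 0.061 * np.array([[1.0, -0.384],
                              [-0.384, 1.0]])  # estimated covariance of alpha_hat

# X = (alpha_hat - alpha_0)^T Sigma_hat^{-1} (alpha_hat - alpha_0), approx chi^2(2)
d = alpha_hat - alpha_0
X = float(d @ np.linalg.solve(Sigma_hat, d))

CHI2_2_95 = 5.991                # 95% critical value of the chi^2(2) distribution
inside_region = X <= CHI2_2_95   # here (0.30, 0.30) lies well inside the region
```

Sweeping `alpha_0` over a grid and keeping the points with X ≤ 5.991 traces out an elliptical region of the kind plotted in Figure 12.4.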

12: Estimating Covariance Functions and Spectra

shows that the large-sample theory standard error estimator performs surprisingly well even when samples are quite small.⁵ Comparable performance can be expected when (12.13) is applied to Yule–Walker parameter estimates.

12.2.8 Model Diagnostics. We have now tentatively identified an AR model, estimated its parameters and perhaps also constructed an estimate of the uncertainty of the parameters with (12.13). The next step is to determine whether the model fits well. We give a very brief sketch here of a few of the ideas involved. Box and Jenkins [60] cover this topic in much more depth.

As with regression diagnostics (cf. [8.3.12–16]), it is important to plot the time series itself and to plot the estimate of the noise process

    \hat{z}_t = x_t - \hat{\alpha}_1 x_{t-1} - \cdots - \hat{\alpha}_p x_{t-p}, \quad t = p+1, \ldots, T.   (12.14)

These plots should be examined for trends, periodicities, outliers, and other evidence that the weak stationarity assumption has been violated.

It is also useful to overfit the model. If it is possible to reduce substantially the estimated error variance \hat{\sigma}_Z^2 (12.7) or increase substantially the log-likelihood (12.9) by adding additional lagged terms to the AR model, then a higher-order model should be considered.

The residuals \hat{z}_t (12.14) should be examined to check that they behave as white noise. They will not, of course, do so exactly because the residuals will only be asymptotically independent of one another, even when the correct model has been selected. None the less, it is useful to compute and plot the auto-correlation function of the estimated noise process. The standard errors of these auto-correlations will be approximately 1/\sqrt{T} at large lags.

It is also sometimes useful to compute a portmanteau lack-of-fit statistic such as

    Q(K) = (T - p) \sum_{\tau=1}^{K} r_{\hat{z}\hat{z}}^2(\tau)   (12.15)

to diagnose whether the first K lags of the auto-correlation function of the residuals jointly estimate the zero function. Note that p is the order of the fitted model and r_{\hat{z}\hat{z}}(\tau) is the auto-correlation function of the estimated noise process. It has been shown that Q is distributed approximately χ²(K − p) when the correct model has been selected, T is moderate to large, and K is of moderate size relative to T. Statistical packages such as S-Plus [78] sometimes plot P(Q(k) > q(k)|H₀) against k for moderate values of k as a diagnostic aid. Lack-of-fit is indicated when these 'p-values' fall to near zero at some lag.

Are there hidden periodicities in the residuals \hat{z}_t? Truly periodic behaviour is sometimes difficult to detect in plots of the time series and the residuals, although a plot of the normalized cumulative periodogram as a function of frequency is often able to reveal such behaviour. The periodogram (cf. [11.2.0] and Section 12.3) is the squared modulus of the Fourier transform (11.16) of the residuals

    I(\omega_j) = \frac{2}{T-p} \left[ \left( \sum_{t=p+1}^{T} \hat{z}_t \cos(2\pi\omega_j t) \right)^2 + \left( \sum_{t=p+1}^{T} \hat{z}_t \sin(2\pi\omega_j t) \right)^2 \right]   (12.16)

computed at frequencies \omega_j = j/(T - p), j = 1, \ldots, (T - p - 1)/2.⁶

The normalized cumulative periodogram is computed from I(\omega_j) as

    Q(\omega_j) = \frac{1}{(T - p)\,\hat{\sigma}_Z^2} \sum_{i=1}^{j} I(\omega_i),

where \hat{\sigma}_Z^2 is the estimated variance (12.7) of the forcing noise. When the correct model has been chosen we expect Q(\omega_j) to increase linearly from 0 to 1 with increasing \omega_j.⁷ Departures from linearity indicate either the presence of discrete periodic behaviour in \hat{z}_t (and hence Z_t) that can not be captured by an AR model, or the presence of quasi-periodic behaviour that cannot be captured by the chosen model. In the latter case, a higher-order model may be indicated.

When \hat{z}_t is exactly white noise,⁸ then

    K = \max_j \max\left[ \,|Q(\omega_j) - 2\omega_j|, \; |Q(\omega_j) - 2\omega_{j-1}|\, \right]   (12.17)

has the same distribution as the Kolmogorov-Smirnov goodness-of-fit statistic (see [5.2.3]) for the case in which the distribution is fully specified

⁵ Note, however, that the Monte Carlo experiment is
⁶ We have assumed, for convenience, that T − p is odd.
⁷ We will see in [12.3.6] that (12.16) is an estimate of the autospectrum of \hat{z}_t. When \hat{z}_t is white, the expected value of I(\omega_j) is \sigma_Z^2 for all j.
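A minimal numerical sketch of the residual diagnostics of [12.2.8] (illustrative only: the AR(2) process, the series length, and the assumption that the fitted coefficients equal the true ones are all invented for the example; the residuals, sample auto-correlations, and portmanteau statistic follow (12.14) and (12.15)):

```python
# Sketch of AR residual diagnostics: residuals (12.14), their sample ACF,
# and the portmanteau statistic (12.15). All values are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# simulate an AR(2) process x_t = 0.9 x_{t-1} - 0.8 x_{t-2} + z_t
a1, a2, T = 0.9, -0.8, 500
z = rng.standard_normal(T)
x = np.zeros(T)
for t in range(2, T):
    x[t] = a1 * x[t - 1] + a2 * x[t - 2] + z[t]

# residuals (12.14), pretending the ML estimates equal the true coefficients
a1_hat, a2_hat, p = a1, a2, 2
z_hat = x[p:] - a1_hat * x[1:-1] - a2_hat * x[:-2]

def sample_acf(y, K):
    """Sample auto-correlations r(1), ..., r(K)."""
    y = y - y.mean()
    c0 = y @ y
    return np.array([(y[:-k] @ y[k:]) / c0 for k in range(1, K + 1)])

K = 20
r = sample_acf(z_hat, K)

# under white-noise residuals each r(tau) has standard error ~ 1/sqrt(T)
se = 1.0 / np.sqrt(T)

# portmanteau statistic (12.15), approximately chi^2(K - p) for a correct model
Q = (T - p) * np.sum(r ** 2)
lack_of_fit = Q > 28.87   # 28.87 = 95% critical value of chi^2(18)
```

Plotting `r` with the ±2/√T band, and Q-based 'p-values' against k, reproduces the two diagnostic displays described in the text.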
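The cumulative periodogram check of (12.16) and (12.17) can be sketched as follows (an assumed implementation, not the book's code: white-noise stand-in residuals are used, with T − p = 401 chosen odd as in footnote 6):

```python
# Sketch of the normalized cumulative periodogram (12.16) and the
# Kolmogorov-Smirnov-type statistic (12.17) for AR residuals.
import numpy as np

rng = np.random.default_rng(1)
z_hat = rng.standard_normal(401)      # stand-in residuals; n = T - p, odd
n = len(z_hat)
t = np.arange(1, n + 1)

j = np.arange(1, (n - 1) // 2 + 1)    # j = 1, ..., (T - p - 1)/2
omega = j / n                         # frequencies omega_j = j / (T - p)

# periodogram (12.16)
C = z_hat @ np.cos(2 * np.pi * np.outer(t, omega))   # sums of z_hat * cos(...)
S = z_hat @ np.sin(2 * np.pi * np.outer(t, omega))   # sums of z_hat * sin(...)
I = (2.0 / n) * (C ** 2 + S ** 2)

sigma2_hat = z_hat.var()              # estimated noise variance, cf. (12.7)

# normalized cumulative periodogram; rises linearly to 1 for white residuals
Q = np.cumsum(I) / (n * sigma2_hat)

# statistic (12.17): sup-distance from the ideal straight line Q = 2 * omega
omega_prev = np.concatenate(([0.0], omega[:-1]))
K = float(np.max(np.maximum(np.abs(Q - 2 * omega), np.abs(Q - 2 * omega_prev))))
```

For a correctly specified model K behaves like the Kolmogorov-Smirnov statistic, so a large K relative to the KS critical values of [5.2.3] flags periodic or quasi-periodic structure left in the residuals.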