\[
\mathrm{AIC}_p = -2\,l(\hat{\alpha}_p, \hat{\sigma}_Z \mid x_T) + 2(p+1) \approx T \ln(\hat{\sigma}_Z^2) + 2(p+1),
\]

where $\hat{\sigma}_Z^2$ is the estimated noise variance (12.7). In effect, the maximum log-likelihood obtained by fitting a model of order $p$ is penalized by subtracting the number of parameters that were fitted. The order is chosen to be that which minimizes $\mathrm{AIC}_p$.
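As a rough illustration of the approximation $\mathrm{AIC}_p \approx T \ln(\hat{\sigma}_Z^2) + 2(p+1)$, the following Python sketch fits AR models of increasing order by solving the Yule–Walker equations with the Levinson–Durbin recursion and evaluates the criterion for each order. The function names are hypothetical, and the simulated AR(2) series with coefficients (0.9, −0.8), the seed, and the burn-in length are assumptions made for the example. It also assumes the biased autocovariance estimator and takes the innovation variance from the recursion as the noise variance estimate, which may differ in detail from (12.7).

```python
import math
import random


def sample_autocov(x, max_lag):
    """Biased sample autocovariances c(0), ..., c(max_lag) of the centered series."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t + k] for t in range(n - k)) / n
            for k in range(max_lag + 1)]


def yule_walker_variances(acov, max_order):
    """Solve the Yule-Walker equations for orders 0..max_order (Levinson-Durbin).

    Returns the estimated noise variance for each order.
    """
    variances = [acov[0]]   # order 0: the noise variance is the series variance
    coeffs = []             # AR coefficients of the previous order
    for p in range(1, max_order + 1):
        # Reflection coefficient, i.e. the lag-p partial autocorrelation.
        num = acov[p] - sum(coeffs[j] * acov[p - 1 - j] for j in range(p - 1))
        k = num / variances[-1]
        coeffs = [coeffs[j] - k * coeffs[p - 2 - j] for j in range(p - 1)] + [k]
        variances.append(variances[-1] * (1.0 - k * k))
    return variances


def aic_by_order(x, max_order):
    """Approximate AIC_p = T ln(sigma_p^2) + 2(p + 1) for p = 0..max_order."""
    T = len(x)
    variances = yule_walker_variances(sample_autocov(x, max_order), max_order)
    return [T * math.log(v) + 2 * (p + 1) for p, v in enumerate(variances)]


def simulate_ar2(n, a1, a2, rng, burn_in=200):
    """Simulate an AR(2) series with unit-variance Gaussian noise (assumed setup)."""
    x = [0.0, 0.0]
    for _ in range(n + burn_in):
        x.append(a1 * x[-1] + a2 * x[-2] + rng.gauss(0.0, 1.0))
    return x[-n:]


rng = random.Random(0)  # arbitrary seed, for reproducibility only
series = simulate_ar2(240, 0.9, -0.8, rng)
aic = aic_by_order(series, 5)
best_order = min(range(len(aic)), key=lambda p: aic[p])
for p, value in enumerate(aic):
    print(f"p={p}  AIC={value:8.1f}")
print("order minimizing AIC:", best_order)
```

The numerical values depend on the simulated realization, so they should not be expected to match any particular published figures.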

Figure 12.6: As Figure 12.5 except these plots diagnose the lack-of-fit of an AR(1) model to the AR(2) time series. [Panel: ACF Plot of Residuals; vertical axis: ACF; horizontal axis: Lag.]

A heuristic way to understand how AIC works is as follows. Suppose we have fitted a model of order $p+q$ and want to test the null hypothesis $H_0\colon \alpha_{p+1} = \cdots = \alpha_{p+q} = 0$, that the last $q$ AR parameters are zero. $H_0$ can be tested with the likelihood ratio statistic

\[
2\delta l = 2\,l(\hat{\alpha}_{p+q}, \hat{\sigma}_Z \mid x_T) - 2\,l(\hat{\alpha}_p, \hat{\sigma}_Z \mid x_T),
\]

which is asymptotically distributed $\chi^2(q)$ under the null hypothesis. Because a $\chi^2(q)$ variable has mean $q$, $E(2\delta l) \approx q$ when $H_0$ is true. That is, if the true order of the AR process is no greater than $p$, then the log-likelihood of an AR($p+q$) model is expected to exceed that of an AR($p$) model by about $q/2$. The penalty, which rises by $2q$ on the scale of $2l$ when $q$ parameters are added, more than compensates for this apparent increase in the log-likelihood.
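The link between the likelihood ratio statistic and the criterion can be made explicit by differencing the definition of $\mathrm{AIC}_p$ at the two orders. This is a short derivation from the formulas above, offered as an illustration rather than as an additional result:

```latex
\begin{align*}
\mathrm{AIC}_{p+q} - \mathrm{AIC}_p
  &= \bigl[-2\,l(\hat{\alpha}_{p+q}, \hat{\sigma}_Z \mid x_T) + 2(p+q+1)\bigr]
   - \bigl[-2\,l(\hat{\alpha}_p, \hat{\sigma}_Z \mid x_T) + 2(p+1)\bigr] \\
  &= -2\delta l + 2q .
\end{align*}
```

Hence the larger model attains the smaller AIC only when $2\delta l > 2q$, that is, when the likelihood ratio statistic exceeds twice its degrees of freedom.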

[Plot for Figure 12.7; horizontal axis: Frequency.]

However, the argument above also reveals a difficulty with the AIC that has been pointed out in the literature (see, e.g., Jones [204]; Katz [216]; Hurvich and Tsai [191]); the AIC determined order is an inconsistent estimate of the order of the process. Note that the variance of $\delta l$, and hence of the AIC, does not decrease with increasing sample size. Consequently, the sampling variability of the AIC determined order will not decrease with increasing sample size. In fact, AIC tends to overestimate the order of the process somewhat. However, these problems are not serious in practice.
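One way to see why the tendency to overfit does not vanish with sample size: a model with $q$ superfluous parameters has the smaller AIC exactly when $2\delta l > 2q$, and under the asymptotic $\chi^2(q)$ approximation the probability of that event does not involve $T$. The Python sketch below evaluates $P(\chi^2(q) > 2q)$ with the standard library only. The helper `chi2_sf` is hypothetical and written for this illustration, and the calculation compares just two candidate orders, which is a simplification of the full selection procedure.

```python
import math


def chi2_sf(x, dof):
    """P(chi-square with `dof` degrees of freedom > x), for integer dof >= 1.

    Starts from the closed forms for dof = 1 and dof = 2 and applies the
    recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1),
    where a = dof / 2 and z = x / 2.
    """
    z = x / 2.0
    if dof % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    while 2 * a < dof:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Asymptotic probability that AIC prefers q superfluous parameters.
overfit_prob = {q: chi2_sf(2.0 * q, q) for q in range(1, 5)}
for q, prob in overfit_prob.items():
    print(f"q={q}  P(chi2({q}) > {2 * q}) = {prob:.3f}")
```

For a single extra parameter the probability is erfc(1), roughly 0.157, and for two extra parameters it is exp(−2), roughly 0.135, whatever the length of the series.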

Figure 12.7: The cumulative periodogram of the residuals obtained by fitting an AR(1) model to a time series of length 240 generated from an AR(2) process with $(\alpha_1, \alpha_2) = (0.9, -0.8)$. The dashed lines indicate 5% critical values for testing that the residuals are white.

The following table gives AICs for AR models of order 0–5 fitted with the Yule–Walker method to our time series of length 240 generated from the AR(2) process with $(\alpha_1, \alpha_2) = (0.9, -0.8)$.

$p$ versus $\mathrm{AIC}_p$

$p$                  0     1     2      3      4      5
$\mathrm{AIC}_p$     313   250   23.9   25.8   26.5   28.5
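The selection rule can be applied to the tabulated values directly. The short Python sketch below copies the values from the table (the variable names are illustrative), finds the minimizing order, and subtracts the penalty $2(p+1)$ to expose the approximate fit term $T \ln(\hat{\sigma}_Z^2)$.

```python
# AIC values copied from the table (orders 0 to 5).
aic = {0: 313.0, 1: 250.0, 2: 23.9, 3: 25.8, 4: 26.5, 5: 28.5}

# The selected order is the one with the smallest AIC.
best_order = min(aic, key=aic.get)

# Removing the penalty 2(p + 1) leaves the approximate fit term T ln(sigma^2).
fit_term = {p: round(value - 2 * (p + 1), 1) for p, value in aic.items()}

print("selected order:", best_order)
for p in sorted(aic):
    print(f"p={p}  AIC={aic[p]:6.1f}  AIC - 2(p+1) = {fit_term[p]:6.1f}")
```

For orders 2 to 5 the fit term moves only from 17.9 to 16.5, so the rise in AIC beyond order 2 comes mostly from the penalty.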

The minimum AIC is indeed achieved by a model of the correct order. The AIC is large for models of order less than 2 and increases slowly with $p$ for models of order greater than 2.

used to circumvent this problem (see, e.g., Katz [217]; Chu and Katz [85, 86]; Zwiers and von Storch [453]; Zheng, Basher, and Thompson [437]).

Two order determining criteria are commonly used. The Akaike information criterion (AIC; see

We repeated this exercise 1000 times with time series of each length 60, 120, and 240 generated from AR(2) processes with $(\alpha_1, \alpha_2) = (0.9, -0.8)$

differences are beyond the scope of this book.