state that the confidence interval will contain the unknown parameter $\tilde{p} \times 100\%$ of the time. We have found this language to be a great source of confusion because it somehow implies that the parameter $\alpha$ is random. Rather, it is the endpoints of the confidence interval that are random; they vary from one realization of the sample to the next. Note that, conditional upon a particular sample, everything about the confidence interval is fixed (both the endpoints and parameter $\alpha$) and,

9 This type of interval estimator is suitable when a regression equation is used to specify the value of an unknown dependent variable (see Chapter 8). A typical application in climatology and meteorology is a statistical forecast improvement procedure in which forecasts from a numerical weather forecast are enhanced using regression equations.

observation becomes available. Then we might be interested in using the information in the first $n$ observations to predict an interval $(X_L[X_1, \dots, X_n],\, X_U[X_1, \dots, X_n])$
that will cover $X_{n+1}$ $\tilde{p} \times 100\%$ of the time. This is a confidence interval for a random variable (see Figure 5.5). The random intervals are now wider than they were in Figure 5.4 because they need to be able to cover a moving, rather than fixed, target. Again note that the confidence level refers to the average behaviour of the interval
\[
(X_L[X_1, \dots, X_n],\, X_U[X_1, \dots, X_n])
\]
in relation to the unknown random variable $X_{n+1}$. The interval is constructed so that in repeated sampling of $X_1, \dots, X_n, X_{n+1}$ the probability of coverage is
\[
P\bigl(X_L[X_1, \dots, X_n] < X_{n+1} < X_U[X_1, \dots, X_n]\bigr) = \tilde{p}.
\]

Note that if we condition on the observed values $x_1, \dots, x_n$ of $X_1, \dots, X_n$ and continue to think of $X_{n+1}$ as random, then the coverage of the interval is no longer exactly $\tilde{p}$. However, in most practical applications the coverage will be close to $\tilde{p}$ because $n$ will be relatively large. That is, we do not expect the upper and lower bounds of the interval to move a great deal due to variation in $X_1, \dots, X_n$.

Figure 5.5: Ten realizations of a 95% confidence interval for a random variable X. On average, 19 out of 20 intervals will cover the next realization of X. The curve shows the density function of X.

5.4.3 Constructing Confidence Intervals. In general, a confidence region is defined indirectly as a set $\Theta_{\tilde{p}}(X_1, \dots, X_n)$ such that
\[
P\bigl(\Theta_{\tilde{p}}(X_1, \dots, X_n) \ni A\bigr) = \tilde{p}. \tag{5.40}
\]
That is, $\Theta_{\tilde{p}}(X_1, \dots, X_n)$ is constructed so that it covers $A$ $\tilde{p} \times 100\%$ of the time in repeated sampling. Depending on the situation, $A$ denotes either a fixed parameter or a random variable. The definition of $\Theta_{\tilde{p}}(X_1, \dots, X_n)$ depends on the assumed statistical model (e.g., the sample can be represented by iid normal random variables), the nature of the target (i.e., either a parameter or a random variable), and the confidence level $\tilde{p}$.

For the moment we limit ourselves to univariate problems (and thus intervals) instead of the more general multivariate problems (which require the use of multi-dimensional confidence regions). Multivariate problems arise in the context of regression analysis (see Chapter 8), for example.

As with point estimators, there are various ways to derive interval estimators. The only condition that must be satisfied is (5.40). Other reasonable requirements are that the set $\Theta_{\tilde{p}}(X_1, \dots, X_n)$ has minimum size, on average, and that it is compact. The latter implies, in the univariate case, that confidence regions can only be intervals.

If the target is a parameter, the general procedure is as follows. We start with an efficient estimator $\hat{\alpha}$ of parameter $\alpha$. We then derive the distribution of $\hat{\alpha}$. This distribution will depend on $\alpha$ somehow. There will generally be a way to transform $\hat{\alpha}$ so that the distribution of the transformed variable no longer depends on $\alpha$. For example, if $\alpha$ is a location parameter such as a mean, then the distribution of $Z = \hat{\alpha} - \alpha$ will not depend upon $\alpha$. Similarly, if $\alpha$ is a scale parameter such as a variance, then the distribution of $\Psi = \hat{\alpha}/\alpha$ will not depend on $\alpha$. The distribution of the transformed variable is then used to construct the confidence interval.

For a location parameter we find critical values $z_L$ and $z_U$ so that $P(Z \le z_L) = (1 - \tilde{p})/2$ and $P(Z \ge z_U) = (1 - \tilde{p})/2$. Therefore, in repeated sampling,
\[
\begin{aligned}
\tilde{p} &= P(z_L < Z < z_U) \\
&= P(z_L < \hat{\alpha} - \alpha < z_U) \\
&= P(-z_U < \alpha - \hat{\alpha} < -z_L) \\
&= P(\hat{\alpha} - z_U < \alpha < \hat{\alpha} - z_L).
\end{aligned} \tag{5.41}
\]
Thus, the $\tilde{p} \times 100\%$ confidence interval for location parameter $\alpha$ has the form $\hat{\alpha} - z_U < \alpha < \hat{\alpha} - z_L$. Note that it is centred on estimator $\hat{\alpha}$ and that it excludes equal proportions of the upper and lower tails of the distribution of $\hat{\alpha}$.

For a scale parameter, we find critical values $\psi_L$ and $\psi_U$ so that $P(\Psi \le \psi_L) = (1 - \tilde{p})/2$ and $P(\Psi \ge \psi_U) = (1 - \tilde{p})/2$. Both critical values will be positive because we are dealing with a scale parameter. Also, for large values of $\tilde{p}$, $\psi_L$ will
be less than 1 and $\psi_U$ will be greater than 1. We expect that in repeated sampling
\[
\begin{aligned}
\tilde{p} &= P(\psi_L < \Psi < \psi_U) \\
&= P\left(\psi_L < \frac{\hat{\alpha}}{\alpha} < \psi_U\right) \\
&= P\left(\frac{1}{\psi_U} < \frac{\alpha}{\hat{\alpha}} < \frac{1}{\psi_L}\right) \\
&= P\left(\frac{\hat{\alpha}}{\psi_U} < \alpha < \frac{\hat{\alpha}}{\psi_L}\right).
\end{aligned} \tag{5.42}
\]
Thus, the $\tilde{p} \times 100\%$ confidence interval for the scale parameter $\alpha$ has the form $\hat{\alpha}/\psi_U < \alpha < \hat{\alpha}/\psi_L$. This will generally be an asymmetric interval about $\hat{\alpha}$ because the sampling distributions of scale parameters are usually skewed. None the less, the interval has been constructed to exclude equal portions of the lower and upper tails of the distribution of $\hat{\alpha}$.

5.4.4 Confidence Intervals for the Mean. Let $X_1, \dots, X_n$ represent a sample of iid normal random variables with mean $\mu$ and variance $\sigma^2$. Then
\[
Z = \sqrt{n}\,(\bar{X} - \mu)/\sigma \tag{5.43}
\]
has the standard normal distribution $N(0, 1)$ (see [4.3.3]). The quantiles of this distribution are tabulated in Appendix D. Using the Appendix, we find critical values $z_L$ and $z_U$ for a given confidence level such that$^{10}$
\[
\tilde{p} = P(z_L < Z < z_U).
\]
The shortest $\tilde{p} \times 100\%$ confidence interval is obtained by choosing $z_U$ so that $P(Z < z_U) = 0.5 + \tilde{p}/2$ and selecting $z_L = -z_U$. Then substituting (5.43) for $Z$ and manipulating as in
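As a minimal numerical sketch of the location-parameter construction in (5.41), applied to the mean of a normal sample through the pivot (5.43), the following Python fragment computes the symmetric critical values and checks the coverage by simulation. It assumes that $\sigma$ is known, and the function names `z_interval_for_mean` and `coverage` are hypothetical, chosen only for this illustration. Quantiles come from the standard library's `statistics.NormalDist` rather than from a table.

```python
import random
from statistics import NormalDist, fmean


def z_interval_for_mean(sample, sigma, p=0.95):
    """Return (lower, upper) bounds of a p*100% interval for the mean.

    Assumes iid normal data with KNOWN standard deviation sigma.
    """
    n = len(sample)
    # Upper critical value: P(Z < z_u) = 0.5 + p/2 for Z ~ N(0, 1).
    z_u = NormalDist().inv_cdf(0.5 + p / 2)
    # Symmetric choice z_l = -z_u excludes equal tail areas.
    z_l = -z_u
    xbar = fmean(sample)
    scale = sigma / n ** 0.5
    # z_l < sqrt(n) * (xbar - mu) / sigma < z_u rearranges to
    # xbar - z_u * scale < mu < xbar - z_l * scale.
    return xbar - z_u * scale, xbar - z_l * scale


def coverage(mu=0.0, sigma=1.0, n=25, p=0.95, trials=20000, seed=1):
    """Fraction of simulated samples whose interval covers the fixed mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval_for_mean(sample, sigma, p)
        hits += lower < mu < upper
    return hits / trials


if __name__ == "__main__":
    print(z_interval_for_mean([1.0, 2.0, 3.0], sigma=1.0))
    print(coverage())
```

In the simulation the parameter `mu` stays fixed while the endpoints change from sample to sample, so the reported fraction estimates the repeated-sampling coverage and should lie close to the chosen `p`.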