$X = x$ is given by

$$\mu_{Y|X=x} = x^T a,$$

which is estimated by

$$\hat{\mu}_{Y|X=x} = x^T \hat{a}.$$

Property 3 of [8.4.2] tells us that

$$\hat{\mu}_{Y|X=x} \sim N\left(x^T a,\ \sigma_E^2\, x^T (X^T X)^{-1} x\right).$$

Using Properties 4 and 5 of [8.4.2] we obtain that

$$T = \frac{\hat{\mu}_{Y|X=x} - \mu_{Y|X=x}}{\hat{\sigma}_E \sqrt{x^T (X^T X)^{-1} x}} \sim t(df_E).$$

Thus a $\tilde{p} \times 100\%$ confidence interval for the conditional mean at $x$ has bounds

$$\hat{\mu}_{Y|X=x} \pm t_{(1+\tilde{p})/2}\, \hat{\sigma}_E \sqrt{x^T (X^T X)^{-1} x}, \qquad (8.33)$$

where $t_{(1+\tilde{p})/2}$ is the appropriate quantile of the t distribution with $df_E$ degrees of freedom obtained from Appendix F. As for simple linear regression, the true response surface (a plane) will be covered by the range of hyper-surfaces described by this expression $\tilde{p} \times 100\%$ of the time.

8.4.5 A Confidence Interval for the Response Variable. As with simple linear regression, a $\tilde{p} \times 100\%$ confidence interval for the response variable $Y$ at $X = x$ is obtained by adding 1 to the quantity under the radical sign in (8.33).

8.4.6 A Confidence Interval for Parameter $a_l$. Let $e_l$ be the $(k+1)$-dimensional vector

$$e_l = (\delta_{l,0}, \delta_{l,1}, \ldots, \delta_{l,k})^T \qquad (8.34)$$

where $\delta_{lj} = 1$ if $l = j$ and $\delta_{lj} = 0$ otherwise. The $\tilde{p} \times 100\%$ confidence interval for $a_l$ is obtained by substituting $e_l$ for $x$ in (8.33).

The matrix $(X^T X)^{-1}$ for the Landsat data fitted with model (8.32) is

$$\begin{pmatrix} 0.1714 & -0.0649 & -0.0371 \\ -0.0649 & 0.0842 & -0.1409 \\ -0.0371 & -0.1409 & 0.4416 \end{pmatrix}. \qquad (8.35)$$

Therefore, the 95% confidence intervals for the estimated parameters are

less certain than that of $a_0$ and $a_1$. However, we can safely infer that all three parameters are significantly different from zero. We should add the caveat that these inferences are valid only if our assumptions about the errors (i.e., that they are iid normal) hold.

The parameter estimators $\hat{a}_l$ are seldom independent because $(X^T X)^{-1}$ is seldom a diagonal matrix. Therefore multiple $\tilde{p} \times 100\%$ confidence intervals for, say, $m$ different parameters do not constitute a joint $\tilde{p}^m \times 100\%$ confidence region for the $m$ parameters taken as a group (see [8.4.7]).⁶

Property 3 of [8.4.2] tells us that the covariance matrix of $\hat{a}$ can be estimated with $\hat{\sigma}_E^2 (X^T X)^{-1}$. The estimates for our example are:

Correlation      a0       a1       a2
a0             1.000   -0.532   -0.135
a1            -0.532    1.000   -0.731
a2            -0.135   -0.731    1.000

⁶ This type of rectangular region in parameter space is also not a good way to construct a joint confidence region when estimators are independent. Construction of a confidence region should use the principle that any point in parameter space outside the confidence region should be less likely given the data than points inside the confidence region. For iid normal data this means that the boundaries of confidence regions should be ellipsoids. See [6.2.2] and Figure 6.16.

8.4.7 Joint Confidence Regions for More Than One Parameter. A joint $\tilde{p} \times 100\%$ confidence region for $p$ parameters $a_{l_1}, \ldots, a_{l_p}$ can be obtained as follows.

First, let $U$ be the $(k+1) \times p$ matrix that has $e_{l_j}$, where $e_{l_j}$ is given by (8.34), in column $j$, for $j = 1, \ldots, p$. Then the vector $a^s = U^T a$ contains the $p$ parameters of interest and is estimated by $\hat{a}^s = U^T \hat{a}$. Using Property 3 of [8.4.2] we see that the estimator has a normal distribution given by

$$\hat{a}^s \sim N\left(U^T a,\ \sigma_E^2\, U^T (X^T X)^{-1} U\right)$$

(see [2.8.9]). Now let

$$V = [U^T (X^T X)^{-1} U]^{-1/2}$$

so that $V^T V = [U^T (X^T X)^{-1} U]^{-1}$, and define $Z$ to be the $p$-dimensional normal random vector

$$Z = V(\hat{a}^s - a^s).$$
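As a quick check on the correlation table in [8.4.6], the following sketch rescales the printed matrix (8.35) to unit diagonal; the common factor $\hat{\sigma}_E^2$ cancels, so no further input is needed. It also evaluates the half-width of interval (8.33) for a given $x$. The function names, and the choice to pass $\hat{\sigma}_E$ and the t quantile in as arguments (they would come from the fit and from Appendix F), are illustrative assumptions rather than part of the original text.

```python
import math

# Matrix (8.35): (X^T X)^{-1} for the Landsat example, as printed (4 decimals).
XTX_INV = [
    [0.1714, -0.0649, -0.0371],
    [-0.0649, 0.0842, -0.1409],
    [-0.0371, -0.1409, 0.4416],
]


def correlation_matrix(c):
    """Rescale a covariance-type matrix so that its diagonal is 1.

    Any common scale factor (here sigma_E^2) cancels in the ratio.
    """
    sd = [math.sqrt(c[i][i]) for i in range(len(c))]
    return [[c[i][j] / (sd[i] * sd[j]) for j in range(len(c))]
            for i in range(len(c))]


def half_width(x, c, sigma_hat, t_quantile):
    """Half-width of interval (8.33): t * sigma_hat * sqrt(x^T c x).

    sigma_hat and t_quantile are supplied by the caller (assumed inputs).
    """
    n = len(x)
    quad = sum(x[i] * c[i][j] * x[j] for i in range(n) for j in range(n))
    return t_quantile * sigma_hat * math.sqrt(quad)


corr = correlation_matrix(XTX_INV)
for row in corr:
    print("  ".join(f"{v:7.3f}" for v in row))
```

Passing the unit vector $e_l$ of (8.34) as `x` reduces `half_width` to $t\,\hat{\sigma}_E$ times the square root of the $l$-th diagonal entry of (8.35), which is the interval for $a_l$ described in [8.4.6]. With the four-decimal entries printed in (8.35), the $(a_0, a_2)$ and $(a_1, a_2)$ correlations round to $-0.135$ and $-0.731$, matching the table; the $(a_0, a_1)$ entry evaluates to about $-0.540$, close to but not identical with the tabulated $-0.532$.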

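The construction of $U$ and of the kernel $V^T V$ in [8.4.7] can be sketched numerically as follows. The example selects two of the three parameters of the Landsat fit using the printed matrix (8.35). The choice of indices 1 and 2, the helper names, and the restriction to $p = 2$ (so that a closed-form 2 x 2 inverse suffices) are assumptions made for illustration only.

```python
# Matrix (8.35): (X^T X)^{-1} for the Landsat example, as printed (4 decimals).
XTX_INV = [
    [0.1714, -0.0649, -0.0371],
    [-0.0649, 0.0842, -0.1409],
    [-0.0371, -0.1409, 0.4416],
]


def selection_matrix(indices, size):
    """U: size x p matrix whose column j is the unit vector e_{l_j} of (8.34)."""
    return [[1.0 if row == l else 0.0 for l in indices] for row in range(size)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def kernel(indices, xtx_inv):
    """V^T V = [U^T (X^T X)^{-1} U]^{-1}; this sketch handles p = 2 only."""
    if len(indices) != 2:
        raise ValueError("this sketch only inverts 2 x 2 blocks")
    u = selection_matrix(indices, len(xtx_inv))
    block = matmul(matmul(transpose(u), xtx_inv), u)
    return inverse_2x2(block)


def quadratic_form(diff, vtv):
    """diff^T (V^T V) diff, where diff stands for a_hat^s - a^s."""
    n = len(diff)
    return sum(diff[i] * vtv[i][j] * diff[j]
               for i in range(n) for j in range(n))


vtv = kernel((1, 2), XTX_INV)  # assumed selection: parameters a_1 and a_2
print(vtv)
```

Because $Z = V(\hat{a}^s - a^s)$, the value returned by `quadratic_form` equals $Z^T Z$, so the matrix square root $V$ itself never has to be computed; only the kernel $V^T V$ is needed.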

Then

$$Z \sim N(0, \sigma_E^2 I),$$

where $I$ is the $p \times p$ identity matrix (see [2.8.9]). Therefore

$$\frac{(\hat{a}^s - a^s)^T V^T V (\hat{a}^s - a^s)}{\sigma_E^2} \sim \chi^2(p).$$

We now have the ingredients needed to construct a simultaneous confidence region for parameters $a_{l_1}, \ldots, a_{l_p}$. By Properties 4 and 5 of [8.4.2], the $\chi^2(p)$ random variable above is independent of the $\chi^2(df_E)$ random variable $\mathrm{SSE}/\sigma_E^2 = df_E\,\hat{\sigma}_E^2/\sigma_E^2$. Therefore, from (2.29), we see that

$$\frac{(\hat{a}^s - a^s)^T V^T V (\hat{a}^s - a^s)}{p\,\hat{\sigma}_E^2} \sim F(p, df_E).$$

Thus the $\tilde{p} \times 100\%$ confidence region, an ellipsoid, is composed of all points in the $(k+1)$-dimensional parameter space that satisfy the inequality

$$\frac{(\hat{a}^s - a^s)^T V^T V (\hat{a}^s - a^s)}{p\,\hat{\sigma}_E^2} < F_{\tilde{p}}, \qquad (8.36)$$

where $F_{\tilde{p}}$ is the $\tilde{p}$-quantile of the F distribution

[Figure 8.13. Horizontal axis: Optical Depth Coefficient; vertical axis: Cloud Cover Coefficient; both axes run from 0.0 to 1.5.]

Figure 8.13: The joint 95% confidence region for the $\ln(\tau)$ and $A_c$ coefficients of model (8.32). The estimated coefficients are indicated by the dot. The dashed lines indicate the individual 95% confidence intervals computed as in (8.33).

8.4.8 Is There a Regression Relationship? This question is answered by testing the null hypothesis $H_0$: $a_1 = \cdots = a_k = 0$. We could proceed as we did above when constructing the joint confidence region by constructing a suitable kernel matrix $V^T V$ and then developing a test statistic of the form