14.1: Definition of Canonical Correlation Patterns

Discarding the high-index EOFs can reduce the amount of noise in the problem by eliminating poorly organized, small-scale features of the fields involved.

Another advantage is that the algebra of the problem is simplified since Σ_XX and Σ_YY are both identity matrices. Thus, according to equations (14.10, 14.11), f_X^i and f_Y^i are eigenvectors of Σ_XY Σ_XY^T and Σ_XY^T Σ_XY respectively. Since these are non-negative definite symmetric matrices, the eigenvectors are orthogonal. Moreover, the canonical correlation patterns are F_X^i = f_X^i and F_Y^i = f_Y^i.

A minor disadvantage is that the patterns are given in the coordinates of the re-normalized EOF space (14.20). To express the patterns in the original coordinate space it is necessary to reverse transformation (14.20) with (14.18) and (14.19):

    f_X^i = ∑_{j=1}^{k_X} (λ_j^X)^{1/2} (f_X^i)_j e_X^j
    f_Y^i = ∑_{j=1}^{k_Y} (λ_j^Y)^{1/2} (f_Y^i)_j e_Y^j        (14.21)
    F_X^i = ∑_{j=1}^{k_X} (λ_j^X)^{-1/2} (f_X^i)_j e_X^j
    F_Y^i = ∑_{j=1}^{k_Y} (λ_j^Y)^{-1/2} (f_Y^i)_j e_Y^j

where (·)_j denotes the jth element of the vector contained within the brackets. The canonical correlation patterns are no longer orthogonal after this back-transformation, and the vectors f^i and F^i are no longer identical.⁷

⁷ Note the similarity between this discussion and that in [14.1.4].

14.1.7 Maximizing Covariance: the 'SVD Approach.' Another way to identify pairs of coupled patterns p_X^i and p_Y^i in random fields X and Y is to search for orthonormal sets of vectors such that the covariance between the expansion coefficients α_i^X = ⟨X, p_X^i⟩ and α_i^Y = ⟨Y, p_Y^i⟩,

    Cov(α_i^X, α_i^Y) = (p_X^i)^T Σ_XY p_Y^i,        (14.22)

is maximized. Note that we explicitly require orthonormal vectors so that X and Y can be expanded as X = ∑_i α_i^X p_X^i and Y = ∑_i α_i^Y p_Y^i.

The solution of (14.22) is obtained as in [14.1.1] by using Lagrange multipliers to enforce the constraints (p_X^i)^T p_X^i = 1 and (p_Y^i)^T p_Y^i = 1. The result is a system of equations,

    Σ_XY p_Y^i = s_X p_X^i
    Σ_XY^T p_X^i = s_Y p_Y^i,        (14.23)

that can be solved by a singular value decomposition (Appendix B). The same solution is obtained by substituting the two equations into each other to obtain

    Σ_XY Σ_XY^T p_X^i = λ_i p_X^i
    Σ_XY^T Σ_XY p_Y^i = λ_i p_Y^i,

where λ_i = s_X s_Y. These equations share the same eigenvalues λ_i > 0, and their normalized eigenvectors are related by

    p_Y^i = Σ_XY^T p_X^i / ‖Σ_XY^T p_X^i‖
    p_X^i = Σ_XY p_Y^i / ‖Σ_XY p_Y^i‖.

It is easily shown that Cov(α_i^X, α_i^Y) = λ_i^{1/2}. Thus the pair of patterns associated with the largest eigenvalue maximizes the covariance. The pair of patterns associated with the second largest eigenvalue and orthogonal to the first pair maximizes the covariability that remains in X − α_1^X p_X^1 and Y − α_1^Y p_Y^1, and so on.

This method is often called 'SVD' analysis. This wording is misleading because it mixes the definition of a statistical parameter with the algorithm used to calculate the parameter. These patterns can be calculated by SVD, but there are other ways, such as conventional eigen-analysis, to get the same information. Patterns p_X^i and p_Y^i are often called left and right singular vectors. The nomenclature is again misleading because the relevant property of these vectors is that they maximize covariance. We therefore call this method Maximum Covariance Analysis (MCA) and call the vectors Maximum Covariance Patterns.

Two properties of MCA are worth mentioning.

• MCA is invariant under coordinate transformation only if the transformation is orthogonal. The eigenvalues, and thus the degree of covariability, change when the transformation is non-orthonormal.

• MCA coefficients α_i^X and α_j^X, i ≠ j, are generally correlated. They are uncorrelated when Σ_XX = σ_X² I. This also applies to the Y-coefficients.

See Wallace, Smith, and Bretherton [411] for examples.

14.1.8 Principal Prediction Patterns. Suppose {Z_t} is a multivariate time series and define

    X_t = Z_t and Y_t = Z_{t+τ}        (14.24)
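The MCA construction can be checked numerically. The following sketch uses hypothetical synthetic data (all variable names and the planted patterns are illustrative, not from the text): it builds two coupled fields, estimates Σ_XY, takes its SVD, and verifies that the leading singular vectors are the maximum covariance patterns, with Cov(α_1^X, α_1^Y) equal to the leading singular value λ_1^{1/2}.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic fields: X (5 components) and Y (4 components)
# share one coupled mode carried by the coefficient series c.
n = 2000
c = rng.standard_normal(n)
px_true = np.array([1.0, -1.0, 0.5, 0.0, 0.0])
py_true = np.array([0.5, 1.0, -0.5, 0.0])
px_true /= np.linalg.norm(px_true)
py_true /= np.linalg.norm(py_true)
X = np.outer(c, px_true) + 0.3 * rng.standard_normal((n, 5))
Y = np.outer(c, py_true) + 0.3 * rng.standard_normal((n, 4))

# Estimated cross-covariance matrix Sigma_XY of the centred fields.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
S_xy = Xc.T @ Yc / (n - 1)

# MCA: the left and right singular vectors of Sigma_XY are the
# maximum covariance patterns p_X^i and p_Y^i (eq. 14.23 as an SVD).
U, s, Vt = np.linalg.svd(S_xy)
p_x, p_y = U[:, 0], Vt[0]

# Expansion coefficients alpha^X = <X, p_X> and alpha^Y = <Y, p_Y>;
# their sample covariance equals the leading singular value,
# i.e. Cov(alpha_1^X, alpha_1^Y) = lambda_1 ** 0.5.
a_x = Xc @ p_x
a_y = Yc @ p_y
cov_1 = a_x @ a_y / (n - 1)

# The patterns also solve the eigenproblem
# Sigma_XY Sigma_XY^T p_X = lambda_1 p_X, with lambda_1 = s[0] ** 2.
resid = S_xy @ S_xy.T @ p_x - s[0] ** 2 * p_x
```

Up to the arbitrary sign of singular vectors, `p_x` and `p_y` recover the planted patterns, and `cov_1` matches `s[0]` to machine precision, since `a_x @ a_y / (n - 1)` reduces algebraically to `p_x.T @ S_xy @ p_y`.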


for some positive lag τ. Application of the CCA algorithm, with prior EOF truncation if the dimension of Z_t is large, identifies patterns F_0^i = F_X^i and F_τ^i = F_Y^i that tend to appear together, that is, patterns with a fixed time lag in the same variable. Thus the presence of F_0^i at a given time indicates that it is likely that pattern F_τ^i will emerge τ time units later. Because of the properties of CCA, patterns F_0^i and F_τ^i depict the present

performed at the (1 − p̃) × 100% significance level by comparing χ̃² (14.25) against the p̃-quantile of the approximating χ² distribution (see Appendix E).

Glynn and Muirhead [142] give a bias correction for ρ̂_i and also give an expression for the asymptotic variance of the corrected estimator that is useful for constructing confidence intervals. Using the Fisher z-transform (recall [8.2.3]), Glynn and Muirhead show that if
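A minimal numerical sketch of the lagged-pair construction (14.24) follows. It is hypothetical throughout: the AR(1) series and all names are invented for illustration, and for brevity it pairs the lagged fields with the covariance-maximizing recipe of [14.1.7] rather than the full CCA-with-EOF-truncation algorithm described in the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-component AR(1) series Z_t with distinct memories,
# so the lag-covariance structure is non-degenerate.
n, tau = 5000, 2
phi = np.array([0.9, 0.5, 0.1])
Z = np.zeros((n, 3))
for t in range(1, n):
    Z[t] = phi * Z[t - 1] + rng.standard_normal(3)

# Lagged pair (14.24): X_t = Z_t and Y_t = Z_{t+tau}.
X, Y = Z[:-tau], Z[tau:]
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
S_lag = Xc.T @ Yc / (len(X) - 1)   # lag-tau covariance matrix

# Leading pair of patterns: F0 appears now, and Ftau tends to emerge
# tau steps later (covariance-maximizing variant of the PPP idea).
U, s, Vt = np.linalg.svd(S_lag)
F0, Ftau = U[:, 0], Vt[0]
```

In this diagonal example the most persistent component (phi = 0.9) dominates the lag covariance, so both `F0` and `Ftau` align with the first coordinate axis, up to sign.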