that

Thus, redundancy analysis offers a number of useful insights. First, it helps us to identify an efficient way of specifying a maximum of variance in one random vector from the information provided by another vector. It also guides us in finding those components of the specifying variable that contain the most information about the variable to be specified. Finally, it offers pairs of patterns that are mapped onto each other. If we observe the pattern $p^{j}$ in the specifying field, then the likelihood of observing pattern $a^{j}$ in the field to be specified is increased.

If we consider the full X-space, we find that

$$\Sigma_{\hat{Y}\hat{Y}} = \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}. \tag{14.52}$$

When comparing this expression with the eigenproblem (14.43), it becomes obvious that the a-vectors are the EOFs of $\hat{Y}$. Thus the $a^{1}$ coefficient accounts for the largest amount of $\hat{Y}$ variance (i.e., $\lambda_1$), $a^{2}$ accounts for the second largest amount of variance $\lambda_2$, and so on. The total variance of the regressed vector $\hat{Y}$ is $\sum_j \lambda_j$. Since $\Sigma_{Y\hat{Y}} = \Sigma_{\hat{Y}\hat{Y}}$, we have

$$R^{2}(Y : \hat{Y}) = R^{2}(Y : X) = \frac{\mathrm{tr}(\Sigma_{\hat{Y}\hat{Y}})}{\mathrm{tr}(\Sigma_{YY})} = \frac{\sum_j \lambda_j}{\mathrm{tr}(\Sigma_{YY})}. \tag{14.53}$$

When we truncate (14.49) to the k components of X that carry the most information about Y, we find that

$$R^{2}(Y : \hat{Y}_k) = R^{2}(Y : B_k^{T} X) = \frac{\sum_{j=1}^{k} \lambda_j}{\mathrm{tr}(\Sigma_{YY})}.$$

14 The proof is straightforward: $RP = \Sigma_{YX}\Sigma_{XX}^{-1}(B^{T})^{-1} = \Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XX}B = \Sigma_{YX}B = AD$.

operator. $X^{T}\Sigma_{XX}^{-1}\Sigma_{XY}\,a^{1}$ is the X-component most strongly correlated with $Y^{T}a^{1}$. Thus redundancy analysis and CCA are equivalent in this special case: both identify the same X and Y directions.

In general, however, the methods are not equivalent. Redundancy analysis finds the best predicted (or specified) components of Y by finding the eigenvectors $a$ of

$$\Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}$$

and then finding the patterns $p$ of X-variations that carry this information. CCA, on the other hand, finds the most strongly correlated components of Y by finding the eigenvectors $\Sigma_{YY}^{-1/2} f_Y$ of

$$(\Sigma_{YY}^{-1/2})^{T}\,\Sigma_{YX}\Sigma_{XX}^{-1}\Sigma_{XY}\,\Sigma_{YY}^{-1/2}.$$

That is, CCA does redundancy analysis on $(\Sigma_{YY}^{-1/2})^{T} Y$, the random vector that is obtained by projecting Y onto its EOFs and scaling each component by its standard deviation. We can therefore anticipate that the two techniques will produce similar results if Y is projected onto a small number of EOFs with similar eigenvalues.

14.4.9 Example: Interdecadal Variability of Intramonthly Percentiles of Significant 'Brent' Wave Height. We now describe an application in which we use redundancy analysis to specify monthly wave height statistics at the Brent oil field, located northeast of Scotland in the North Atlantic at (61°N, 1.5°E). Wave height (sea state) data are available from visual assessments made on ships of opportunity, at lighthouses, from wave rider buoys, and from shipborne instruments at ocean weather stations. Also, wave height maps

have been constructed from wind analyses for the purpose of ship routing (Bouws et al. [57]). These data are sparse and suffer from various inhomogeneities. Also, the records are generally too short to allow an assessment of changes during the past century.

Thus, observational data alone do not contain sufficient information about the interdecadal variability of wave statistics. One solution is a combined statistical/dynamical reconstruction of the past that uses a dynamical wave model. The model is forced with recent wind data that are believed to be fairly reliable and not strongly affected by improving analysis techniques.$^{15}$ The wave heights derived from the hindcast simulation are treated as observations and are used to build a statistical model linking the wave heights to surface air pressure. Finally, the resulting statistical model is fed with the observed air pressure from the beginning of the century onward, thereby producing a plausible estimate of wave height statistics for the entire century. The statistical model is presented below.

In this case we bring together 'apples' and 'oranges', that is, two vector quantities that are not directly linked. One vector time series, $X_t$, represents the winter (DJF) monthly mean surface air-pressure distributions in the North Atlantic. The other vector time series, $Y_t$, is a three-dimensional random vector consisting of the 50th, 80th, and 90th percentiles of the intramonthly distributions of significant wave height$^{16}$ in the Brent oil field at (61°N, 1.5°E). Both vector time series are assumed to be centred, so that the air-pressure values and percentiles are deviations from their respective long-term means.

The monthly mean of North Atlantic SLP is indirectly linked to the intramonthly percentiles, since storms affect both the monthly mean air-pressure distribution and the distribution of wave heights within a month at a specific location. Of course, the storm activity may also be seen as being conditioned by the monthly mean state.

The daily wave height data are taken from a 40-year 'hindcast' simulation (Günther et al. [153]). The following analysis assumes that the hindcasts and wind field analyses both represent the real world well enough for statistical relationships between the wave and wind fields on the monthly time scale to be reliably diagnosed.

Figure 14.9: First two monthly mean air-pressure anomaly distributions $p^{k}$ identified in a redundancy analysis as being most strongly linked to simultaneous variations of the intramonthly percentiles of significant wave height in the Brent oil field (61°N, 1.5°E; northeast of Scotland).

Table 14.2: The vectors $a^{k}$ of anomalous intramonthly percentiles of significant wave height are given as rows in the following table. Units: cm.

             Wave height percentile
             50%      80%      90%
    a^1      -81     -107     -114
    a^2       32        2      -25

A redundancy analysis of the two vector time series is performed to detect the dominant coupled anomaly patterns in the mean air pressure and in the intramonthly wave height percentiles. The SLP patterns $p^{1}$ and $p^{2}$ are shown in Figure 14.9 and the corresponding intramonthly percentiles $a^{1}$ and $a^{2}$ are listed in Table 14.2. The time coefficients are normalized to unit variance so that

15 Note that the homogeneity of weather maps and their surface winds is difficult to assess. Analysis system improvements can introduce artificial signals, such as increasing frequencies of extreme events, into the hindcast. Improved analysis procedures, be they more or better observations or more intelligently designed dynamical and statistical analysis tools, lead to the emergence of more details in weather maps and, therefore, larger extremes.

16 Significant wave height is a physical parameter that describes the wave field on the sea surface. The word 'significant' does not imply a significance test in this context. See [3.2.4].
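The quantities used in a redundancy analysis of this kind can be illustrated numerically. The following is a minimal NumPy sketch on synthetic data, not the computation behind the Brent example. It estimates the covariance of the regressed vector as in (14.52), takes its eigenvectors as the a-vectors, and forms the cumulative redundancy index as in (14.53). The function names and the synthetic data are hypothetical. The normalisation $B = \Sigma_{XX}^{-1}\Sigma_{XY} A \Lambda^{-1/2}$ is an assumption, chosen so that $B^{T}\Sigma_{XX}B = I$ and $\Sigma_{YX}B = AD$ with $D = \Lambda^{1/2}$, in line with footnote 14. The last function evaluates the CCA eigenproblem using a symmetric inverse square root of $\Sigma_{YY}$, which is one possible choice of $\Sigma_{YY}^{-1/2}$.

```python
import numpy as np


def make_synthetic_data(n=400, m_x=6, m_y=3, seed=0):
    """Hypothetical data: Y depends linearly on X, plus noise."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, m_x))
    M = rng.standard_normal((m_x, m_y))
    Y = X @ M + 2.0 * rng.standard_normal((n, m_y))
    return X, Y


def sample_covariances(X, Y):
    """Sample estimates of Sigma_XX, Sigma_YY and Sigma_XY (rows are samples)."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    return Xc.T @ Xc / (n - 1), Yc.T @ Yc / (n - 1), Xc.T @ Yc / (n - 1)


def redundancy_analysis(X, Y):
    """Return eigenvalues, a-vectors, B, patterns P and cumulative redundancy."""
    S_xx, S_yy, S_xy = sample_covariances(X, Y)
    # Covariance of the regressed vector Y-hat, as in (14.52).
    S_hat = S_xy.T @ np.linalg.solve(S_xx, S_xy)
    S_hat = 0.5 * (S_hat + S_hat.T)  # remove round-off asymmetry
    lam, A = np.linalg.eigh(S_hat)
    order = np.argsort(lam)[::-1]  # largest eigenvalue first
    lam, A = lam[order], A[:, order]
    # Assumed normalisation: column j is Sigma_XX^{-1} Sigma_XY a^j / sqrt(lam_j).
    B = np.linalg.solve(S_xx, S_xy @ A) / np.sqrt(lam)
    # Patterns P = (B^T)^{-1} = Sigma_XX B, as used in footnote 14.
    P = S_xx @ B
    # Redundancy index after keeping the first k components, k = 1, 2, ...
    redundancy = np.cumsum(lam) / np.trace(S_yy)
    return lam, A, B, P, redundancy


def squared_canonical_correlations(X, Y):
    """Eigenvalues of the CCA matrix, i.e. redundancy analysis on whitened Y."""
    S_xx, S_yy, S_xy = sample_covariances(X, Y)
    w, V = np.linalg.eigh(S_yy)
    S_yy_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T  # symmetric inverse square root
    C = S_yy_inv_sqrt.T @ S_xy.T @ np.linalg.solve(S_xx, S_xy) @ S_yy_inv_sqrt
    return np.sort(np.linalg.eigvalsh(0.5 * (C + C.T)))[::-1]


if __name__ == "__main__":
    X, Y = make_synthetic_data()
    lam, A, B, P, redundancy = redundancy_analysis(X, Y)
    print("eigenvalues:", lam)
    print("cumulative redundancy index:", redundancy)
    print("squared canonical correlations:", squared_canonical_correlations(X, Y))
```

The last entry of `redundancy` corresponds to the full index of (14.53), and earlier entries to the truncated form. Comparing `lam / np.trace(S_yy)` with the squared canonical correlations shows how the two methods weight the Y directions differently unless the EOFs of Y have similar eigenvalues.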


The second pattern describes a mean south-
