that can occur when there is dependence between the two samples.

Avoiding dependence is more difficult than it sounds. If the data are serially correlated, it may be necessary to separate the learning data from the validation data in every cross-validation iteration by a buffer of observations that is long enough to ensure that the learning and validation data are statistically independent. Even when serial correlation is not a problem, there is still a concern about the verification subset because the sum of anomalies across both subsets is constrained to total zero. The distortion in skill estimates that is caused by this kind of geometrical dependence can be large when the validation subsets are small.13 It is therefore imperative that the entire process that turns data into a fitted model, including the calculation of climatologies, anomalies, and so on, be cross-validated.

13 When the validation subset is of size 1, which is often the case, the validation anomaly is completely determined by the sum of anomalies in the learning subset.
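The buffered cross-validation just described can be sketched as follows. The buffer width, the toy data, and the climatology-based forecast are illustrative assumptions, not something specified in the text:

```python
import numpy as np

def buffered_cv_indices(n, buffer):
    """Yield (learning_indices, t) for each validation point t.
    The learning set excludes t itself plus `buffer` observations on
    each side, so that serially correlated neighbours of the
    validation point never enter the learning data."""
    for t in range(n):
        keep = np.ones(n, dtype=bool)
        keep[max(0, t - buffer):min(n, t + buffer + 1)] = False
        yield np.where(keep)[0], t

# Toy skill estimate: forecast each point with the learning-set mean.
rng = np.random.default_rng(0)
x = rng.standard_normal(100)          # invented data for illustration

errors = []
for learn, t in buffered_cv_indices(len(x), buffer=5):
    # The climatology is recomputed from the learning subset only in
    # every fold, so the anomaly calculation is itself cross-validated.
    climatology = x[learn].mean()
    errors.append((x[t] - climatology) ** 2)
print(np.mean(errors))                # cross-validated mean squared error
```

Note that with a validation subset of size 1 and no buffer, the validation anomaly would be fully determined by the learning-set anomalies, which is exactly the geometrical dependence the footnote warns about.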

Part VII

Appendices


A Notation

Throughout this book we use the following notation.

• Real- and complex-valued univariate random variables are given as bold-faced upper-case letters,

such as A or X.

• A random sample of size n from a univariate population is generally represented by a collection of

independent and identically distributed (iid) random variables {X1 , . . . , Xn }.

• The kth order statistic (see [2.6.9]) is denoted by X(k|n) .

• Vector random variables are given as bold-faced upper-case letters with a vector on top, for

example, A or X. The components of a vector are labelled by subscripts, for instance X =

(X1 , . . . , Xm )T , where m is the length of the vector.

• A random sample of size n from a multivariate population is generally represented by a collection

of iid random vectors {X1 , . . . , Xn }. The kth element of Xj is identified as Xjk, Xj,k, or sometimes

(X j )k .

• Univariate stochastic processes in discrete time are identified by {Xt : t ∈ Z} or sometimes simply

as {Xt }. Multivariate stochastic processes are denoted analogously.

• Realizations of a random variable, for example, B or B, are denoted by bold-faced lower-case

letters, such as b or b.

• Matrices are denoted with calligraphic letters, such as A or X. An m × n matrix has m rows and n

columns. The matrix element in the ith row and the jth column is denoted aij.

• Sets of numbers or vectors are denoted by upper case Greek characters, such as Ω.

• Statistical parameters are denoted by lower case Greek characters, such as θ or α, or upper case

letters in italics, such as T.

• Estimated statistical parameters are denoted with a hat '^', as in θ̂ or T̂.

• Definitions are stated in italics. When new expressions are introduced, they are often written in

italics or enclosed in quotation marks.

• Footnotes contain additional comments that are not important for the development of the arguments

or concepts. They are sometimes used to explain expressions that may be unknown to some readers.
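As a concrete (hypothetical) illustration of the sample and order-statistic notation above, with invented numbers: the kth order statistic x(k|n) of a realized sample is simply its kth smallest value.

```python
import numpy as np

sample = np.array([2.3, -0.7, 1.1, 0.4, 3.0])  # a realization {x1, ..., x5}
ordered = np.sort(sample)                      # (x(1|5), ..., x(5|5))

k = 2
x_k = ordered[k - 1]   # the kth order statistic x(2|5)
print(x_k)             # -> 0.4
```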


Special Conventions

• The sample space is given by S. Subsets of S (i.e., events) are indicated by upper case italics, such

as A or B.

• The probability of an event A ∈ S is given by P (A).

• Sample sizes are usually denoted by n and the dimension of a vector by m.

• We use the notation fX to represent the probability function of a discrete random variable X or the

probability density function of a continuous random variable X. Likewise, the distribution function is given by FX.

• A vertical bar '|' is used to denote conditioning, as in P(A|B) or fX|Y (x|Y = y).

• The expectation operator, applied to a random variable X, is indicated by E(X).

• Averaging in time or over a sample is indicated by a horizontal over-bar, as in x̄ = (1/n) ∑_{i=1}^{n} x_i.

• The covariance matrix of a random vector X is represented by Σ or Σxx. The covariance between

vector elements Xi and Xj is denoted σij.

• The cross-covariance matrix between random vectors X and Y is denoted Σxy.

• Correlation is denoted by ρ. Estimated correlations are denoted by ρ̂ or r.

• Lags in time are denoted by τ.

• The auto-covariance function of a weakly stationary time series {Xt : t ∈ Z} is denoted γ(τ) or

γxx(τ). The corresponding estimator is denoted γ̂(τ), γ̂xx(τ), or sometimes cxx(τ).

• The cross-covariance function of a weakly stationary bivariate time series {(Xt , Yt )T : t ∈ Z} is

denoted γxy(τ). The corresponding estimator is denoted γ̂xy(τ) or cxy(τ).

• The spectral density function of a weakly stationary time series {Xt : t ∈ Z} is denoted Γ(ω) or

Γxx(ω). The cross-spectral density function of a weakly stationary bivariate time series

{(Xt , Yt )T : t ∈ Z} is denoted Γxy(ω). Λxy(ω) and Ψxy(ω) denote the co- and quadrature

spectra; Axy(ω) and Φxy(ω) denote the amplitude and phase spectra; κxy(ω) denotes the (squared)

coherency spectrum.

• The symbols µx and µ⃗x are reserved for ensemble mean values of a random variable X and a

random vector X⃗ respectively. Subscript 'x' will often be omitted for convenience.

• The symbol σ represents a standard deviation; its square σ² is a variance. If required for clarity, the

name of the random variable is added as a subscript; for example, σx denotes the standard deviation

of X.

• We write X ∼ N(µ, σ²) if X is normally distributed with mean µ and variance σ² (see [2.7.3]

and Appendix D). We write X ∼ B(n, p) and say the discrete random variable X has a binomial

distribution when X is the number of successes in n independent Bernoulli trials with probability p

of success on any trial (see [2.2.2]). We write X ∼ χ²(k) if X has a χ² distribution with k degrees

of freedom (see [2.6.8] and Appendix E). Similarly, we write X ∼ t(k) if X has a t distribution

with k degrees of freedom (see [2.6.8] and Appendix F). We indicate that X has an F distribution

with k and l degrees of freedom by writing X ∼ F(k, l) (see [2.6.10] and Appendix G).

Because of their historical background, the normal distribution and the t distribution are often

called the Gaussian distribution and Student's t distribution, respectively. To preserve simplicity

and clarity in our notation, we do not use these expressions.

• Geographical latitude and longitude are denoted (λ, φ). The vertical coordinate is labelled z or p.

• The symbol λ is also used to identify eigenvalues.


• The size of a confidence interval is denoted as p̃ × 100%, where p̃ is a probability between 0 and 1

(a typical value is 0.95). Significance levels are denoted as (1 − p̃) × 100%.
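A minimal sketch of the auto-covariance estimator notation introduced above; the biased 1/n normalization used here is an assumed (though common) convention, and the book's own estimator may differ:

```python
import numpy as np

def acov(x, tau):
    """Estimate the auto-covariance gamma_xx(tau) of a sample x,
    using the biased 1/n convention (an assumed choice)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()                 # sample mean x-bar
    tau = abs(tau)                  # gamma_xx is symmetric in tau
    return np.sum((x[:n - tau] - xbar) * (x[tau:] - xbar)) / n

rng = np.random.default_rng(1)
x = rng.standard_normal(500)        # invented white-noise sample
print(acov(x, 0))                   # roughly the sample variance
print(acov(x, 1))                   # near zero for white noise
```

At lag 0 the estimator reduces to the (biased) sample variance, which is why γ̂xx(0) doubles as a variance estimate.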

Mathematical Operators

• The complex conjugate of a complex number x is indicated with a star: x*.

• The transpose of a matrix A or a vector x is denoted with a superscript T: AT or xT.

• The complex conjugate of a complex matrix C or vector c is denoted C* or c*. The conjugate

transpose operation is indicated by C† or c†.

• The dot product (also scalar or inner product) of two vectors a and b is given by ⟨a, b⟩ = aT b* =

b† a = ∑_i ai bi*.

• The norm of a vector a is given by ‖a‖ = ⟨a, a⟩^{1/2}.

• The sign operator is given by sgn(x) = −1 if x < 0 and sgn(x) = 1 if x ≥ 0.

• The symbol (p q) (the binomial coefficient) represents p!/(q!(p − q)!) for integers p, q with p ≥ q,

where 0! = 1 and p! = 1 × 2 × · · · × p.

• The Fourier transform F is an operator that operates on series st with ∑_{t=−∞}^{∞} |st| < ∞, such that

F{s}(ω) = ∑_{t=−∞}^{∞} st e^{−i2πtω}. The result of the Fourier transform, F{s}, is a complex function

defined on the real interval [−1/2, 1/2]. See also Appendix C.
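For a series with only finitely many nonzero terms, the transform above reduces to a finite sum. A sketch, where the test series and the evaluation frequencies are invented for illustration:

```python
import numpy as np

def dtft(s, omega):
    """Evaluate F{s}(omega) = sum_t s_t * exp(-i 2 pi t omega)
    for a finite series s_0, ..., s_{n-1} (all other terms zero)."""
    s = np.asarray(s, dtype=complex)
    t = np.arange(len(s))
    return np.sum(s * np.exp(-1j * 2 * np.pi * t * omega))

s = [1.0, 2.0, 1.0]
print(dtft(s, 0.0))      # equals sum(s) = 4, as a complex number
print(abs(dtft(s, 0.5))) # this symmetric series vanishes at omega = 1/2
```

Evaluating at ω = 0 always returns the plain sum of the series, a quick sanity check on any implementation.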

A brief summary of some essentials about linear bases, eigenvalues, and eigenvectors can be found in

Appendix B.

Abbreviations and Technical Expressions

Frequently used abbreviations include:

• AGCM, or simply GCM: (Atmospheric) General Circulation Model. These are detailed models that