The design of storm sewage systems, roads, and other structures in a city is constrained by the largest precipitation event anticipated during a fixed design period (typically 50 or 100 years). The design of electrical distribution systems, buildings and other free-standing structures must account for

It is important to understand that the extreme is a realization of a random variable, namely the Nth order statistic (see [2.6.9]) of a sample of size N. The extreme value in a subsequent sample of equal size is another realization of the same random variable.
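The point that each extreme is one realization of the Nth order statistic can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the short time scale variable is taken to be exponential with unit mean, the sample size of 365 mimics one year of daily values, and the function names `sample_maximum` and `simulate_extremes` are hypothetical helpers, not part of the text.

```python
import random


def sample_maximum(sample):
    """Return the largest value, i.e. the N-th order statistic of a sample of size N."""
    ordered = sorted(sample)  # order statistics X_(1) <= ... <= X_(N)
    return ordered[-1]


def simulate_extremes(n_samples, sample_size, seed=0):
    """Draw n_samples independent samples and return the extreme of each one."""
    rng = random.Random(seed)  # fixed seed so the illustration is reproducible
    extremes = []
    for _ in range(n_samples):
        # Assumed short time scale variable: exponential with mean 1.
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        extremes.append(sample_maximum(sample))
    return extremes


# Five samples of equal size give five realizations of the same random variable.
extremes = simulate_extremes(n_samples=5, sample_size=365)
for k, value in enumerate(extremes, start=1):
    print(f"sample {k}: extreme = {value:.2f}")
```

The printed extremes will generally differ from one sample to the next even though every sample is drawn from the same distribution, which is why the extreme has to be treated as a random variable with a distribution of its own.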

2: Probability Theory


2.9: Extreme Value Distributions

• This process is repeated over several time intervals in order to obtain the object of extreme value analysis: a sample consisting only of each interval's extreme value. In the previous example, if the daily precipitation accumulation is observed over a period of 50 years, then the sample of extreme values to be analysed is also of size 50, since each year yields one maximum.

Extreme value analysis requires some sort of assumption about the stationarity and ergodicity of the climate system, since only one realization of the past climate is available from climate archives. The implicit working assumption in most extreme value analyses is that the n extremes in the sample are realizations of n independent and identically distributed random variables (we will discuss suitable distributions for extreme values shortly). Sometimes, though, it is clear that the climate system violates this assumption on certain time scales. For example, during an El Niño, the statistical characteristics of precipitation change on time scales of less than a season.

The following are examples of the context in which extreme value analyses are conducted.

Structural engineers designing a transmission tower may require knowledge about the extremes of the five-minute mean wind speed. They would extract daily, monthly or annual maxima of five-minute mean wind speed for a particular location from climatological archives for a nearby observing station.

Civil engineers designing a floodway around a city might require knowledge about the extremes of 24-hour precipitation, and will therefore extract daily, monthly and annual maxima of 24-hour precipitation from climatological archives.

A frequently encountered difficulty with archived precipitation data is that the archives generally contain the accumulation for a fixed 24-hour period (usually beginning at 00 UTC) as opposed to moving window 24-hour accumulations. This is of concern because often the critical quantity is not, for example, the maximum amount of rain that falls in a 24-hour period that consistently begins at 00 UTC (i.e., a fixed 24-hour window), but the maximum amount of rain that falls in a 24-hour period starting at any time of the day (i.e., a moving 24-hour window). Therefore, a nuance of the analysis of extreme precipitation is that the fixed window accumulations must be multiplied by an empirically derived constant to ensure that the extremes of a sample of fixed window accumulations match those of a corresponding sample of moving window accumulations. Bruce [69] describes how the correction factor is estimated (see also Watt [416], p. 76, and Hershfield and Wilson [176]).¹⁶

2.9.2 Model Identification. In extreme value analysis, the behaviour of the sample of extremes is almost always represented by a parametric model, a probability distribution selected for its ability to represent the characteristics of the extreme values reasonably well.¹⁷ Asymptotic arguments can be used to select the extreme value distribution if something is known about the distribution of the random variable observed on short time scales; an alternative approach is to use the extreme values themselves to identify a suitable model. Both methods will be briefly discussed here.

2.9.3 Model Identification: The Asymptotic Approach. Asymptotic arguments are often an important part of selecting an extreme value distribution. Under fairly general conditions it can be shown that, in samples of size n, the distribution of the extreme values converges, as $n \to \infty$, to one of three models: the Gumbel (or Fisher-Tippett type I, or EV-I) distribution, the Fisher-Tippett type II (or EV-II) distribution, and the Fisher-Tippett type III (or EV-III) distribution.¹⁸

The rate of convergence is largely determined by the upper (sometimes lower) tail of the distribution of the short time scale variable (e.g., daily precipitation) that generates the extremes.¹⁹ If the distribution of the extreme values converges to, say, the Gumbel distribution, then we say that the short time scale variable lies in the domain of attraction of the Gumbel distribution.

The EV-I distribution will be described briefly below. Descriptions of the EV-II and EV-III distributions can be found in Gumbel [149] or Leadbetter et al. [246].

Both the exponential distribution and the normal distribution lie in the domain of attraction of the EV-I distribution. However, the distribution of the largest of a sample of n independent and identically distributed exponential random variables is closer to the EV-I distribution than the distribution of the largest of a sample of n independent and identically distributed normal random variables. Thus, Cook [89] argues that it is better to do extreme value analyses on wind pressure (which is proportional to wind speed squared) than on wind speed because the former has a distribution that is closer to exponential, and therefore closer to EV-I. Zwiers [439] makes use of this argument in his analysis of extreme wind speeds at several Canadian observing stations.

2.9.4 Model Identification: Using the Data. Unfortunately, the asymptotic EV distributions do not always fit the observed extremes well. This can occur for a variety of reasons, not the least of which is that climate data are, at best, cyclo-stationary. Consider, for example, the daily precipitation accumulation. While the annual maximum daily precipitation accumulation is formally the maximum of 365 observations, the effect of the annual cycle may be such that only a small number of observations have any chance at all of attaining the status of annual maximum.

At Vancouver (British Columbia, Canada), for example (see Figure 1.7), the annual maximum is usually generated during winter when there is strong on-shore flow from the south-west. It is apparent from Figure 1.7 that only about 60 days of the year have the potential to generate the annual maximum at Vancouver. On the other hand, the annual maximum can occur with approximately equal likelihood on any day of the year on Sable Island (see Figure 1.7), located on the east coast of Canada.

Because the asymptotic distribution is not

Another frequently used method of model identification relies on estimates of the skewness and kurtosis of the extreme value distribution that are computed from the sample of extremes. The (skewness, kurtosis) pair is plotted on a chart of kurtosis as a function of skewness for various families of distributions, often called Pearson curves (see Elderton and Johnson [112]). A model is identified by the proximity of the plotted point to a distribution's curve (there is a unique Pearson curve for every distribution).

Model identification with Pearson curves is difficult and often not completely successful because the skewness and kurtosis estimates are subject to a great deal of sampling variability. Estimates often end up occupying a point in the (skewness, kurtosis) plane that cannot be visited by adjusting parameters within known families of distributions.

A better alternative is to use L-moments in combination with L-moment versions of the Pearson curves [183] for model identification. L-moments are subject to less sampling variation, that is, they are more robust than conventional moments and discriminate better between competing models (see Hosking [183]).

2.9.5 Model Fitting. Once a model (i.e., extreme value distribution) has been selected, the next step in the analysis is to 'fit' the chosen extreme value distribution to the sample of extremes. Fitting means estimating the unknown parameters of the chosen extreme value distribution.

Several methods may be used for parameter estimation. These methods may produce quite different results with the small sample of extremes that is usually available, even though their results become asymptotically identical as the number of observed extremes becomes large. The theoretical suitability of one method over another in repeated sampling has often been the subject of lively debate. However, these discussions are of little use when economic decisions strongly depend on the accuracy of the results, as is often true.²⁰

The methods most often used for fitting (see Section 5.2) are

• the method of moments,

• the method of maximum likelihood,

¹⁶ The correction factor used to convert fixed window 24-hour precipitation accumulations to moving window accumulations in Canada is 1.13 [69]. This factor will vary with location depending upon how and when precipitation is produced. The factor also depends upon the accumulation period.

¹⁷ See Section 4.2 for a discussion of the difference between parametric and non-parametric statistics.

¹⁸ In the classical treatment of extreme value analysis (see Gumbel [149]) it is necessary to assume that the extremes come from samples that can be represented by independent and identically distributed random variables. Leadbetter et al. [246] show that the independence assumption can be substantially relaxed. The same asymptotic results obtained in the classical setting are obtainable when the extremes are those of samples taken from a weakly stationary, ergodic time series (see [10.2.1]).

¹⁹ When we speak of the 'convergence' of a sequence of random variables, say $Y_i$, $i = 1, 2, \ldots$, to another random variable $Z$ we mean either convergence in distribution or convergence in mean square. We say $Y_i$ converges to $Z$ in distribution, and write $Y_i \xrightarrow{d} Z$, if $P(|Y_i - Z| > \varepsilon) \to 0$ as $i \to \infty$ for every $\varepsilon > 0$. We say $Y_i$ converges to $Z$ in mean square, and write $Y_i \xrightarrow{ms} Z$, if $E\big((Y_i - Z)^2\big) \to 0$ as $i \to \infty$. Convergence in mean square implies convergence in distribution.

²⁰ For example, estimates of the largest precipitation event expected to occur during a 25-year period will strongly