3.2 HISTORY AND GENESIS

The history of the Pareto distribution is lucidly and comprehensively covered in the
above-mentioned monograph by Arnold (1983). We shall therefore provide selected
highlights of his exposition, supplemented by several additional details that have
emerged in the last 20 years or so, and describe the very few earlier historical sources
not covered in Arnold (1983). The history of Pareto distributions is still a vibrant
subject of modern research. The Web site devoted to Walras and Pareto
(http://www.unil.ch/cwp/), sponsored by the University of Lausanne, where Vilfredo
Pareto spent some 15 productive years, constantly updates the information on this
topic, and we encourage the interested reader to consult this valuable source of
historical research to enrich his or her perspective on income distributions.

3.2.1 Early History

As was already mentioned in Chapter 1, Pareto (1895, 1896) observed a decreasing
linear relationship between the logarithm of income and the logarithm of $N_x$, the
number of income receivers with income greater than $x$, $x \geq x_0$, when analyzing
income reported for income tax purposes. Hence, he specified

$$\log N_x = A - a \log x, \tag{3.8}$$

that is,

$$N_x = e^{A} x^{-a}, \tag{3.9}$$

where $A, a > 0$. Normalizing by the number of income receivers $N := N_{x_0}$, one
obtains

$$\frac{N_x}{N} = 1 - F(x) = \left(\frac{x}{x_0}\right)^{-a}, \quad x \geq x_0 > 0. \tag{3.10}$$
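The log-linear relationship in (3.8) can be illustrated on simulated data. The following is a minimal sketch, not Pareto's own procedure: it draws incomes whose survival function has the form (3.10), counts $N_x$ on a grid of thresholds, and fits a straight line to the points (log x, log N_x) by ordinary least squares. The function names, the sample size, the seed, the threshold grid, and the values a = 1.5 and x0 = 1 are all assumptions chosen for illustration.

```python
import math
import random


def pareto_sample(n, a, x0, rng):
    """Draw n incomes with survival function (x / x0) ** (-a), by inverse transform."""
    # 1 - rng.random() lies in (0, 1], so the negative power is always finite.
    return [x0 * (1.0 - rng.random()) ** (-1.0 / a) for _ in range(n)]


def count_above(incomes, x):
    """N_x: the number of income receivers with income greater than x."""
    return sum(1 for income in incomes if income > x)


def fit_log_log_line(incomes, thresholds):
    """Least-squares fit of log N_x = A - a * log x; returns the pair (A, a)."""
    points = [(math.log(x), math.log(count_above(incomes, x))) for x in thresholds]
    n = len(points)
    mean_u = sum(u for u, _ in points) / n
    mean_w = sum(w for _, w in points) / n
    sxy = sum((u - mean_u) * (w - mean_w) for u, w in points)
    sxx = sum((u - mean_u) ** 2 for u, _ in points)
    slope = sxy / sxx
    intercept = mean_w - slope * mean_u
    return intercept, -slope  # the fitted slope is -a


# Illustrative values only (assumptions, not data from the text).
rng = random.Random(0)
incomes = pareto_sample(50_000, a=1.5, x0=1.0, rng=rng)
thresholds = [1.5 ** k for k in range(10)]  # from x0 = 1 up to 1.5 ** 9
A_hat, a_hat = fit_log_log_line(incomes, thresholds)
print(f"fitted A = {A_hat:.3f}, fitted a = {a_hat:.3f}")
```

For a sample that really follows (3.10), the fitted slope should lie close to the value of a used in the simulation and the fitted intercept close to log N, in line with (3.9). Least squares on the log-log plot is used here only because it mirrors the graphical argument; it is not claimed to be an efficient estimator.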

Almost immediately, public interest was aroused and other economists began to

criticize the idea of a universal form with a single shape parameter permitting

inappropriate comparisons between societies. It became clear fairly soon that the

Pareto distribution is only a good approximation of high incomes above a
certain threshold and also that the mysterious and, some claim, notorious $a$ is not
always close to 1.5 as Pareto initially believed. Nonetheless, as late as 1941
H. T. Davis considered the value $a = 1.5$ to be a dividing line between egalitarian
societies ($a > 1.5$) and inegalitarian ones ($a < 1.5$). [Kakwani (1980b),
reminiscent of Mandelbrot (1960), referred to a Pareto distribution with $a = 1.5$
as the strongest Pareto law.]

It should be noted that Pareto's discovery was initially met with some
resentment by the English and American school (see our biography of Pareto in
Appendix A for further details), with notable exceptions such as Stamp (1914) and
Bowley (1926). And as late as 1935 Shirras (p. 680) asserted that for Indian
income tax and super tax data from the 1910s and 1920s (notably for the year
1929–1930)

There is indeed no Pareto law. It is time that it should be entirely discarded in studies on the

distribution of income.

From the graphical evidence Shirras provided, one is inclined to conclude that if

anything, the data are very much in agreement with a Pareto distribution (Adarkar

and Sen Gupta, 1936).

In his defense of the Pareto distribution as an appropriate model for personal

incomes and wealth, MacGregor (1936) opened with the statement

Economics has not so many inductive laws that it can afford to lose any.

and asserted that “the law and the name mark a stage in investigation, like Boyle's
Law or Darwin's Law, and, although amended, they remain authoritative as first
approximations not to be lightly gone back on.” One year later Johnson (1937) was
able to confirm, for U.S. income tax data for each year for the period 1914–1933,
that $a$ does, in fact, not vary substantially, obtaining estimates
$\hat{a} \in [1.34, 1.90]$.

In the actuarial literature an early contribution was made by a Norwegian
actuary, Birger Meidell, who in 1912 employed the Pareto distribution when
trying to determine the maximum risk in life insurance. His working hypothesis
was that the sums insured are proportional to the incomes of the policy holders,
thus incorporating Pareto's pioneering work. The same idea was later expressed
by Hagstroem (1925). In subsequent actuarial investigations the Pareto distribution
was more often used in connection with nonlife insurance, notably automobile
and fire insurance, and we shall mention several relevant contributions in
Section 3.7.

3.2.2 Pareto Income Distribution Derived from the Distribution of Aptitudes

Many (particularly early) writers alluded to a relationship between the distribution of

income and the distribution of talents or aptitudes (e.g., Ammon, 1895; Pareto,

1897a; Boissevain, 1939). Rhodes (1944), in a neglected paper, assumed that
“talent” is a continuous variable $Z$, with density $h(z)$, say, with an average income
accruing to those with talent $z$ being given by $m(z) = E(X \mid Z = z)$. His crucial
assumption is that the conditional coefficient of variation $l = CV(X \mid Z = z)$ is
constant for all talent groups.

If we write the conditional survival function of income (for those with talent $z$) in
standardized form

$$F\!\left(\frac{x - m(z)}{l\,m(z)}\right), \tag{3.11}$$

where, by assumption, $l\,m(z)$ is the conditional standard deviation of income, the
unconditional distribution is given by

$$F(x) = \int_0^{\infty} h(z)\,F\!\left(\frac{x - m(z)}{l\,m(z)}\right) dz. \tag{3.12}$$

Setting $x - m(z) =: l\,m(z)\,v$, we can write

$$m(z) = \frac{x}{1 + l v}. \tag{3.13}$$

This allows us to obtain $z$ as a function of $x$ and $v$. However, we require the
distribution of $X$. In order to obtain an expression for the c.d.f. of this random
variable, we use

$$\frac{\partial m(z)}{\partial v}\,\frac{dz}{dv} = -\frac{l x}{(1 + l v)^2}. \tag{3.14}$$
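As a brief gloss on where (3.14) comes from, (3.13) can be differentiated with respect to $v$ for fixed $x$, treating $z$ as a function of $v$. The sketch below reads the derivative of $m$ on the left-hand side as the derivative with respect to its own argument, written $m'(z)$. Both that notation and that reading are assumptions made here for illustration; they do not appear in the text.

```latex
\frac{d}{dv}\, m\bigl(z(v)\bigr)
  = m'(z)\,\frac{dz}{dv}
  = \frac{d}{dv}\,\frac{x}{1 + l v}
  = -\frac{l x}{(1 + l v)^2}
\quad\Longrightarrow\quad
dz = -\frac{l x}{m'(z)\,(1 + l v)^2}\,dv .
```

If $m$ is increasing in $z$, then $v$ decreases as $z$ increases, so the minus sign is absorbed once the limits of integration are written in increasing order in $v$.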

Setting $M(v, x) := \partial m(z)/\partial v$ and $P(v, x) := h(z)$, we can now write

$$\int_{v_1}^{v_2} \frac{P(v, x)\,F(v)\,l x}{M(v, x)\,(1 + l v)^2}\,dv. \tag{3.15}$$
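The change of variables from (3.12) to (3.15) can be checked numerically on a concrete case. The following is a minimal sketch under assumptions that are not in the text: an exponential talent density h, mean income m(z) = m0 exp(z), and a lognormal conditional income distribution, which has a constant coefficient of variation. The limits used for (3.15) are those implied by (3.13), namely v1 = -1/l and v2 = (x/m0 - 1)/l, and the factor M(v, x) is evaluated as the derivative of m with respect to z, expressed through v and x. All names and parameter values are illustrative.

```python
import math

SIGMA = 0.5                                  # log-scale spread of the conditional law (assumed)
LAM = math.sqrt(math.exp(SIGMA ** 2) - 1.0)  # constant conditional CV, the text's l
THETA = 2.0                                  # rate of the assumed talent density h
M0 = 1.0                                     # m(0), mean income at zero talent (assumed)


def h(z):
    """Assumed talent density on z >= 0: exponential with rate THETA."""
    return THETA * math.exp(-THETA * z)


def m(z):
    """Assumed mean income at talent z: m(z) = M0 * exp(z), so m'(z) = m(z)."""
    return M0 * math.exp(z)


def std_survival(v):
    """P((X - m) / (LAM * m) > v) for a lognormal X with mean m and CV LAM."""
    ratio = 1.0 + LAM * v  # equals x / m(z)
    if ratio <= 0.0:
        return 1.0  # income is positive, so the event is certain
    u = (math.log(ratio) + 0.5 * SIGMA ** 2) / SIGMA
    return 0.5 * math.erfc(u / math.sqrt(2.0))


def midpoint(f, lo, hi, n=100_000):
    """Composite midpoint rule; it never evaluates f at the endpoints."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))


def survival_over_talent(x, z_max=30.0):
    """(3.12): integrate over z, truncated at z_max where h is negligible."""
    return midpoint(lambda z: h(z) * std_survival((x - m(z)) / (LAM * m(z))), 0.0, z_max)


def survival_over_v(x):
    """(3.15) with v1 = -1/LAM and v2 = (x / M0 - 1) / LAM."""
    v1, v2 = -1.0 / LAM, (x / M0 - 1.0) / LAM

    def integrand(v):
        mz = x / (1.0 + LAM * v)  # (3.13)
        z = math.log(mz / M0)     # inverts the assumed m(z)
        dm_dz = mz                # derivative of M0 * exp(z) with respect to z
        return h(z) * std_survival(v) * LAM * x / (dm_dz * (1.0 + LAM * v) ** 2)

    return midpoint(integrand, v1, v2)


lhs, rhs = survival_over_talent(3.0), survival_over_v(3.0)
print(f"integral over z: {lhs:.8f}, integral over v: {rhs:.8f}")
```

The two integrals should agree up to quadrature error for any x > 0. The midpoint rule is used only to keep the sketch free of external dependencies, and it conveniently avoids the endpoint v1, where 1 + l v vanishes.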

The new limits of integration are given from (3.13), yielding
$m_0 := m(0) = x/(1 + l v_2)$ for the lowest value of $z$ corresponding to no talent. Thus,
$v_2 = (x/m_0 - 1)/l$. For simplicity, assume that $m(z) \to \infty$ for $z \to \infty$ (“infinite
talent implies infinite income”), which further yields $v_1 = -1/l$. Hence, (3.15) can
be rewritten as

° (x=m0 À1)=l

P(v, x)F (v)lx

dv: (3:16)