• IMdens=eval(strcat(g,'(IM)')); %2*(IM-.47)/(.53)^2;

• FN=eval(strcat(f,'(IM)'));

• est=FN./IMdens; % mean(est) provides the estimator

• v=var(FN./IMdens)/length(IM); % this is the variance of the estimator


• % GIVES mean(est) = 0.4614, v = 6.4329e-008

• EFFICIENCY GAIN OVER CRUDE IS AROUND 35.

Exponential tilting and importance sampling: Estimating the probability of rare events

Suppose we wish to estimate an extreme quantile such as the 99.9% VaR. Then we need to estimate the probability that a random variable exceeds some large value, i.e. $P(S_{10} > a)$ where $S_{10} = \sum_{i=1}^{10} X_i$ for independent random variables $X_i$, all with probability density function $f(x)$. This probability is $E\{I(S_{10} > a)\}$ and is estimated by the corresponding sample mean.

If we conduct the simulation under this distribution, very few of the sums will exceed $a$. What if we generate the $X_i$ as independent random variables under a different (tilted) probability density function
$$ g(x) = \frac{e^{tx} f(x)}{k}, $$
where $k$ is chosen so that this is a density function (i.e. $k = E\{e^{tX}\}$, the moment generating function at $t$), and then use importance sampling?

For example, suppose that $f(x) = e^{-x}$. Then the tilted distribution is
$$ g_t(x) = \frac{1}{k}\, e^{tx - x} = (1-t)\, e^{-x(1-t)}, $$
since $(1-t)$ is the multiple that makes this function a p.d.f.: another exponential distribution.

If we generate $Z_i$ from this p.d.f., each has expected value $\theta = \frac{1}{1-t}$. The importance sampling estimator is the sample mean of terms with expectation
$$ E\Big\{ I\Big(\sum_{i=1}^{n} Z_i > a\Big) \prod_{i=1}^{n} \frac{f(Z_i)}{g_t(Z_i)} \Big\}
 = E\Big\{ I\Big(\sum_{i=1}^{n} Z_i > a\Big) \prod_{i=1}^{n} \theta e^{-t Z_i} \Big\}
 = \theta^n\, E\Big\{ I\Big(\sum_{i=1}^{n} Z_i > a\Big)\, e^{-t \sum_{i=1}^{n} Z_i} \Big\}, $$
where the $Z_i$ are independent with exponential($\theta$) p.d.f.

What is the best value of $\theta$? The variance of the importance sampling estimator is small when we choose $t$ so that $\sum_{i=1}^{n} Z_i$ is approximately $a$. Choose $t = 1 - 1/\theta$ so that $E(Z_i) = \theta = a/n$; that is, choose $\theta$ approximately $a/n$.

Matlab Code for Example, a=22.6, n=10 (so a/n = 2.26; the code below uses θ = 2.5, close to this)

• N=500000

• X=exprnd(1,10,N);

• est1=(sum(X)>22.6);

• mean(est1) % 0.001

• var(est1) % 8.8e-004

Using Importance sampling

• Z=exprnd(2.5,10,N);
• S=sum(Z); theta=2.5;
• est2=theta^10*exp(-(1-1/theta)*S).*(S>22.6);
• mean(est2) % = 0.001
• var(est2) % 5.5e-6, and SE about 0.000003

• EFFICIENCY GAIN FROM IMPORTANCE SAMPLING IS ABOUT 180.
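The same experiment can be sketched in Python with NumPy (an assumed translation of the Matlab code above, not part of the original notes); parameter values follow the example, with a = 22.6, n = 10, θ = 2.5, and a smaller N for speed:

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, a, theta = 200_000, 10, 22.6, 2.5

# crude Monte Carlo: indicator that the sum of n Exp(1) draws exceeds a
X = rng.exponential(scale=1.0, size=(n, N))
est1 = (X.sum(axis=0) > a).astype(float)

# importance sampling: draw from the tilted density, exponential with
# mean theta, and reweight by the likelihood ratio
# theta^n * exp(-(1 - 1/theta) * S)
Z = rng.exponential(scale=theta, size=(n, N))
S = Z.sum(axis=0)
est2 = theta**n * np.exp(-(1.0 - 1.0 / theta) * S) * (S > a)

gain = est1.var() / est2.var()   # efficiency gain of importance sampling
```

With these settings the importance sampling estimator again concentrates near 0.001 with a much smaller variance than the crude estimator.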

Common Random Numbers: estimating a difference

• Suppose we wish to estimate the difference between two systems or values: e.g.
  – Estimate the slope of a function at a point
  – Estimate the difference in performance between two pieces of equipment
  – Estimate the difference between a call option price for r=0.05 and r=0.06.
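A minimal Python sketch of common random numbers for the third example, assuming Black-Scholes dynamics and illustrative parameters chosen here (S0 = K = 100, σ = 0.2, T = 1; none of these come from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
S0, K, sigma, T, N = 100.0, 100.0, 0.2, 1.0, 200_000

def call_payoff(r, z):
    """Discounted call payoff under Black-Scholes dynamics driven by normals z."""
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)
    return np.exp(-r * T) * np.maximum(ST - K, 0.0)

z = rng.standard_normal(N)

# common random numbers: the SAME normals drive both prices
diff_crn = call_payoff(0.06, z) - call_payoff(0.05, z)

# independent sampling: fresh normals for the second price
diff_ind = call_payoff(0.06, z) - call_payoff(0.05, rng.standard_normal(N))

var_crn, var_ind = diff_crn.var(), diff_ind.var()
```

Reusing the same normals makes the two payoffs highly correlated, so their difference has far smaller variance than with independent draws.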

Importance Sampling, Tilting and the Saddlepoint Approximation

The Edgeworth expansion of the probability density function of the sample mean of i.i.d. random variables takes the form
$$ n(x; \mu, \sigma^2)\Big\{ 1 + \frac{\kappa}{6\sigma^3 \sqrt{n}}\,[z^3 - 3z] + O(1/n) \Big\}, $$
where $z = \sqrt{n}\,(x - \mu)/\sigma$, $\kappa = E(X - \mu)^3$, and $n(x; \mu, \sigma^2)$ is the usual normal density function approximation.

Notice that at $x = \mu$ the correction is zero and the normal approximation is accurate to $O(1/n)$.

Consider tilting the original distribution so as to achieve this... If we wish to approximate the probability density function $f(x)$ of $X$ at the point $x$, and $X$ has cumulant generating function $K(t) = \ln E\{\exp(Xt)\}$, then

Saddlepoint (cont)

the density $f_\theta(x) = e^{\theta x - K(\theta)} f(x)$ has mean given by $K'(\theta)$, so if we choose $\theta$ so that $K'(\theta) = x$, the mean is $x$ and the variance is $K''(\theta)$.

The Edgeworth approximation to the density $f_\theta$ at its mean is
$$ n(K'(\theta);\, K'(\theta),\, K''(\theta)) = \frac{1}{\sqrt{2\pi K''(\theta)}}. $$
Therefore the (saddlepoint) approximation to $f(x)$ is
$$ \frac{1}{\sqrt{2\pi K''(\theta)}}\; e^{K(\theta) - \theta x}, \quad \text{where } K'(\theta) = x. $$
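As a quick check of the formula (a worked example added here, not in the original notes), take $X$ exponential(1), so $K(t) = -\ln(1-t)$; the saddlepoint approximation then reproduces $f(x) = e^{-x}$ exactly up to the constant $e/\sqrt{2\pi} \approx 1.084$:

```python
import math

def saddlepoint_exp(x):
    """Saddlepoint approximation to the Exp(1) density f(x) = exp(-x).
    K(t) = -log(1-t); solving K'(theta) = x gives theta = 1 - 1/x,
    with K(theta) = log(x) and K''(theta) = x**2."""
    theta = 1.0 - 1.0 / x
    K = math.log(x)
    Kpp = x * x
    return math.exp(K - theta * x) / math.sqrt(2 * math.pi * Kpp)

# the ratio to the true density e^{-x} is the constant e / sqrt(2*pi)
ratios = [saddlepoint_exp(x) / math.exp(-x) for x in (0.5, 1.0, 2.0, 5.0)]
```

Here the saddlepoint approximation is exact after renormalization, a well-known property for the gamma family.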

Two steps of saddlepoint

• Saddlepoint approximation has two steps
  – Tilt the original density so that the mean of the tilted distribution is around the point of interest x
  – Use the normal approximation to the tilted distribution.
• Importance sampling from the tilted density avoids the second step.

Tilting and large deviations (Robert and Casella p. 132)

Large deviations results concern the tail of a distribution of partial sums, e.g. $P[S_n > x]$ when this probability is very small.

Cramér's Theorem. Put $I(x) = \theta x - K(\theta)$, where again $\theta$ is such that $K'(\theta) = x$. Then
$$ \frac{1}{n} \ln(P[S_n > x]) \approx -I(x). $$
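A numerical sanity check of Cramér's theorem (an illustration added here, assuming $X_i \sim$ Exp(1) and reading the event as the sample mean $S_n/n$ exceeding $x$, so that $K'(\theta) = x$ is solvable):

```python
import math

# For Exp(1), K(t) = -log(1-t); K'(theta) = x gives theta = 1 - 1/x,
# so the rate function is I(x) = theta*x - K(theta) = x - 1 - log(x).
def rate_exp(x):
    return x - 1.0 - math.log(x)

def log_tail(n, x):
    """log P[sum of n Exp(1) draws > n*x], using the Gamma(n,1) survival
    function P = exp(-n*x) * sum_{k<n} (n*x)^k / k!, computed term by term."""
    s, term = 0.0, 1.0
    for k in range(n):
        s += term
        term *= n * x / (k + 1)
    return -n * x + math.log(s)

n, x = 200, 2.0
approx = log_tail(n, x) / n      # (1/n) log P[S_n/n > x]
exact_rate = -rate_exp(x)        # Cramer limit -I(x) = -(x - 1 - log x)
```

With n = 200 the two values agree to roughly two decimal places; the residual gap is the $O(\log n / n)$ prefactor that the theorem ignores.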

Student presentation 11.

Importance Sampling (Robert and Casella 85-87)

• Suppose $X$ has the Student t distribution with 12 degrees of freedom. Estimate
$$ E\Big\{ \frac{X^5}{1 + (X-3)^2}\; I(X \ge 0) \Big\} $$
using crude Monte Carlo and importance sampling, with various choices of the importance distribution including the Cauchy, U[0, 1/2.1], and normal distributions. Which do you prefer and why?
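As a starting point for the exercise (a crude Monte Carlo baseline only, not a full solution; Python/NumPy assumed), the target expectation can be estimated directly from $t_{12}$ draws:

```python
import numpy as np

rng = np.random.default_rng(7)

def h(x):
    # integrand from the exercise: x^5 / (1 + (x-3)^2) on x >= 0, else 0
    return np.where(x >= 0, x**5 / (1.0 + (x - 3.0)**2), 0.0)

# crude Monte Carlo using Student t draws with 12 degrees of freedom
x = rng.standard_t(df=12, size=500_000)
hx = h(x)
est = hx.mean()
se = hx.std(ddof=1) / np.sqrt(x.size)
```

The large standard error of this baseline (driven by the heavy-tailed $x^5$ factor) is what the importance distributions in the exercise are meant to reduce.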

Reducing Variance when estimating the difference of Expected values

In general we wish to estimate