
Let θ_ST = a f(aU_1) + (1 − a) f(a + (1 − a)U_2)

where U_1, U_2 are independent U[0,1]. Why weights a, 1 − a?

E(θ_ST) = E[a f(aU_1) + (1 − a) f(a + (1 − a)U_2)] = a E f(aU_1) + (1 − a) E f(a + (1 − a)U_2)

= a ∫_0^1 f(au) du + (1 − a) ∫_0^1 f(a + (1 − a)u) du = ∫_0^a f(u) du + ∫_a^1 f(u) du = ∫_0^1 f(u) du
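As a quick numerical check (not in the slides), the identity above can be verified in Python; f(u) = e^u is an assumed test integrand, chosen only because its integral over [0,1] is known to be e − 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(u):
    return np.exp(u)  # assumed integrand; the slides leave f unspecified

a, n = 0.7, 200_000
U1 = rng.random(n)
U2 = rng.random(n)

# one draw of theta_ST per (U1, U2) pair
theta_ST = a * f(a * U1) + (1 - a) * f(a + (1 - a) * U2)
est = theta_ST.mean()  # should be close to int_0^1 e^u du = e - 1
```

Averaging many independent copies of θ_ST recovers the integral, confirming that the weights a and 1 − a make the estimator unbiased.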

Weights in a stratified sample

• The weights attached to a given stratum (interval, region) are proportional to the length of the interval (or volume of the region).

• We may allow many observations in a given stratum and apply the weight to the stratum mean.

General stratified sample

• Choose n_i from stratum i, i = 1, 2, ..., m.

• Evaluate the function at n_i random points, uniformly distributed over stratum i, and average the result:

AV_i = (1/n_i) ∑_{j=1}^{n_i} f(V_ij), where V_ij is uniformly distributed over stratum i.

• Use the WEIGHTED AVERAGE of the stratum means, with weight proportional to the length (area, or volume) of the stratum.

Variance of Stratified Sample

Estimator:

∑_i (x_{i+1} − x_i) AV_i

Variance of the stratified sample estimator:

∑_i (x_{i+1} − x_i)^2 var(f(V_ij)) / n_i

Performance of stratified sample

• Example. We chose a = .7 and sampled 500,000 points on [0, 0.7] and another 500,000 on [0.7, 1].

• a=.7;
• F=a*fn(a*rand(1,500000))+(1-a)*fn(a+(1-a)*rand(1,500000));
• mean(F) % = 0.4608
• var(F)/length(F) % = 9.18e-008

• Compare with crude Monte Carlo with n = 1,000,000: variance = 4.3e-007. Efficiency around 5.
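The same comparison can be sketched in Python; the slides' fn is not shown, so f(u) = e^u is an assumed stand-in, and the crude estimator is given the same total budget of function evaluations.

```python
import numpy as np

rng = np.random.default_rng(1)
f = np.exp          # assumed integrand (fn in the slides is unspecified)
a, n = 0.7, 500_000

# stratified: each of the n draws evaluates f once in [0, a] and once in [a, 1]
F = a * f(a * rng.random(n)) + (1 - a) * f(a + (1 - a) * rng.random(n))
var_strat = F.var() / n

# crude: 2n evaluations of f at uniform points, matching the total cost
G = f(rng.random(2 * n))
var_crude = G.var() / (2 * n)

efficiency = var_crude / var_strat   # > 1 means stratification helps
```

The exact efficiency depends on the integrand; for this choice of f it is a modest factor, while the slides report around 5 for their fn.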

Optimal sample sizes for stratified sample

The optimal sample size in each stratum is proportional to

(a) the size of the stratum (e.g. interval length)

(b) the stratum standard deviation

So if stratum i is the interval (x_i, x_{i+1}), then the optimal sample size n_i is proportional to

(x_{i+1} − x_i) √var(f(x_i + U(x_{i+1} − x_i)))

with var(f(x_i + U(x_{i+1} − x_i))) estimated from a preliminary sample.

The function "stratified"

function [est,v,n]=stratified(x,nsample)
% input: x = vector of strata boundaries (e.g. x=[0 .25 .5 .75 1]), nsample = total sample size
est=0; n=[]; m=length(x);
for i=1:m-1   % preliminary simulation of 1000 per stratum to optimize sample sizes
   v=var(fn(unifrnd(x(i),x(i+1),1,1000)));
   n=[n (x(i+1)-x(i))*sqrt(v)];
end
n=floor(nsample*n/sum(n));   % these are the stratum sample sizes
v=0;
for i=1:m-1
   F=fn(unifrnd(x(i),x(i+1),1,n(i)));
   est=est+(x(i+1)-x(i))*mean(F);
   v=v+var(F)*(x(i+1)-x(i))^2/n(i);
end
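A Python port of the MATLAB function above, as a sketch: the integrand fn is passed in explicitly (the slides assume a global fn), and allocation is proportional to stratum length times the pilot standard deviation, as stated on the previous slide.

```python
import numpy as np

def stratified(fn, x, nsample, pilot=1000, rng=None):
    """Stratified Monte Carlo estimate of int f over [x[0], x[-1]].

    Returns (estimate, variance of the estimate, stratum sample sizes)."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = np.asarray(x, dtype=float)
    widths = np.diff(x)

    # pilot run: estimate each stratum's standard deviation
    sd = np.array([fn(rng.uniform(lo, hi, pilot)).std()
                   for lo, hi in zip(x[:-1], x[1:])])

    # allocate samples proportionally to width * standard deviation
    alloc = widths * sd
    n = np.floor(nsample * alloc / alloc.sum()).astype(int)
    n = np.maximum(n, 1)

    est = v = 0.0
    for lo, hi, w, ni in zip(x[:-1], x[1:], widths, n):
        F = fn(rng.uniform(lo, hi, ni))
        est += w * F.mean()                # weighted stratum mean
        v += F.var() * w**2 / ni           # weighted stratum variance
    return est, v, n

est, v, n = stratified(np.exp, [0, .25, .5, .75, 1], 100_000)
```

Here f(u) = e^u is an assumed test integrand (its integral over [0,1] is e − 1); with the slides' fn and strata the call would mirror stratified([0 .25 .5 .75 1],nsample) in MATLAB.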

Stratified Sample with more strata, optimal sample size

• Try 4 strata [0 .55], [.55 .8], [.8 .94], [.94 1] with optimal sample sizes
  – 32717 78703 46420 42158
• [est,v,n]=stratified([0 .55 .80 .94 1],500000)
• estimator 0.4617, variance = 3.7e-008
• Efficiency now around 24 times that of crude. Try also:
• [est,v,n]=stratified([.47 .62 .75 .87 .96 1],500000) (var = 1.4e-8)

Conditioning as a Variance Reduction Tool

• Example: suppose we wish to find the area of a region, say the area under the graph of a function h(x).

• Crude Monte Carlo: find a probability density function g(x) such that cg(x) ≥ h(x) for all x. Then generate random points (X_i, Y_i) uniformly distributed under the graph of cg(x): X_i has p.d.f. g(x) and Y_i = cg(X_i)U_i. The proportion of these points which are also under the graph of h(x)

→ (area under h(x)) / (area under cg(x)) = (area under h(x)) / c

Conditioning (II)

• Leads to the estimator:

θ̂_CR = c × (number of points under h) / (total number of points) = (c/n) ∑_{i=1}^n I(Y_i ≤ h(X_i)) = (c/n) ∑_{i=1}^n Z_i, say

Consider an alternative estimator

θ̂_Co = (c/n) ∑_{i=1}^n E[Z_i | X_i] = (c/n) ∑_{i=1}^n h(X_i)/(cg(X_i)) = (1/n) ∑_{i=1}^n h(X_i)/g(X_i)

since, given X_i, Y_i is uniform on (0, cg(X_i)), so E[Z_i | X_i] = P(Y_i ≤ h(X_i) | X_i) = h(X_i)/(cg(X_i)).

We obtained this new estimator by taking the conditional expectation of the old. Does this always give an unbiased estimator with smaller variance?
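The two estimators above can be compared in a sketch for an assumed concrete case not in the slides: h(x) = x² on [0,1], envelope density g(x) = 1 (uniform) and c = 1, so cg(x) ≥ h(x) everywhere and the area under h is 1/3.

```python
import numpy as np

rng = np.random.default_rng(2)
h = lambda x: x ** 2      # assumed h; the slides keep h generic
c, n = 1.0, 100_000

X = rng.random(n)          # X_i with density g (here g = 1 on [0,1])
Y = c * 1.0 * rng.random(n)  # Y_i = c g(X_i) U_i, uniform under cg(x)

Z = (Y <= h(X)).astype(float)   # Z_i: indicator that the point is under h
theta_CR = (c / n) * Z.sum()    # hit-or-miss estimator
theta_Co = h(X).mean()          # (1/n) sum h(X_i)/g(X_i), with g = 1

var_CR = c**2 * Z.var() / n
var_Co = h(X).var() / n         # conditioning never increases the variance
```

Both estimators target 1/3; replacing each Z_i by E[Z_i | X_i] removes the extra randomness contributed by Y_i, which is exactly the variance reduction formalized on the next slide.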

Some Properties of Conditional Expectation

1. E{E[Y | X]} = E(Y)

2. var(E[Y | X]) ≤ var(Y), with equality if and only if Y is a function of X.

If θ̂ is an unbiased estimator and for some random variable X we can compute E(θ̂ | X), then this is another unbiased estimator, typically with smaller variance. Therefore

var(Z_i) ≥ var(E(Z_i | X_i)), and the second estimator (c/n) ∑_{i=1}^n E[Z_i | X_i] is better than the first, (c/n) ∑_{i=1}^n Z_i.

Example of Conditioning: Estimating pi