β is required for the initial and final states; intermediate values may be chosen differently. This property may be exploited to produce a faster sampling.

13 See Park and Schulten (2004) for a discussion of various ensembles.

246 Free energy, entropy and potential of mean force

7.8.4 Statistical considerations

Since the averaging is done over an exponential function of the work done,

the trajectories with the smaller work values will dominate the result. This

produces erratic jumps in cumulative averages whenever occasional low val-

ues appear. The statistical properties and validity of approximations have

been considered by several authors (Park et al. 2003, Hummer 2001, Ytreberg and Zuckerman 2004). Let us consider a simple example.

We sample a property x (the “work”) from a distribution function p(x),

and we wish to compute the quantity A:

\[
A = -\frac{1}{\beta}\ln\bigl\langle e^{-\beta x}\bigr\rangle = -\frac{1}{\beta}\ln\int p(x)\,e^{-\beta x}\,dx. \tag{7.98}
\]

Without loss of generality, we may take the average of x as zero, so that all values refer to the average of the distribution. First consider the cumulant expansion¹⁴ in powers of β, obtained from a simple Taylor expansion:

\[
A = -\frac{\beta}{2!}\langle x^2\rangle + \frac{\beta^2}{3!}\langle x^3\rangle - \frac{\beta^3}{4!}\bigl[\langle x^4\rangle - 3\langle x^2\rangle^2\bigr] + \cdots. \tag{7.99}
\]

For a Gaussian distribution,

\[
p(x) = \bigl(\sigma\sqrt{2\pi}\bigr)^{-1}\exp\left(-\frac{x^2}{2\sigma^2}\right), \tag{7.100}
\]

only the first term survives, as can easily be seen by direct integration:

\[
A = -\tfrac{1}{2}\beta\sigma^2. \tag{7.101}
\]
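The direct integration proceeds by completing the square in the exponent:

\[
\bigl\langle e^{-\beta x}\bigr\rangle
= \bigl(\sigma\sqrt{2\pi}\bigr)^{-1}\int_{-\infty}^{\infty} e^{-\beta x - x^2/2\sigma^2}\,dx
= e^{\beta^2\sigma^2/2}\,\bigl(\sigma\sqrt{2\pi}\bigr)^{-1}\int_{-\infty}^{\infty} e^{-(x+\beta\sigma^2)^2/2\sigma^2}\,dx
= e^{\beta^2\sigma^2/2},
\]

so that \(A = -\beta^{-1}\ln\langle e^{-\beta x}\rangle = -\tfrac{1}{2}\beta\sigma^2\).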

Figure 7.7 shows that the cumulative average gives very poor convergence. For this figure 1000 points have been sampled from a normal distribution of zero average and unit variance, and the cumulative exponential averages were calculated with values of β equal to 1, 2 and 4. The theoretical values for A are −0.5, −1 and −2, respectively. For β = 1 convergence is reached after about 600 points; for β = 2, 1000 points are barely enough, and for β = 4, 1000 points are clearly insufficient to reach convergence. This means that computing the exponential average is hardly an option if the computation of one path takes a considerable computational effort.
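The sampling experiment behind Figure 7.7 is easy to reproduce. The following sketch (assuming NumPy; seed and variable names are ours) draws 1000 normal samples and follows the running estimate of A for each β:

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed; any seed shows the same behavior
n = 1000
x = rng.standard_normal(n)          # "work" samples: average 0, variance 1

counts = np.arange(1, n + 1)
results = {}
for beta in (1, 2, 4):
    # cumulative average of exp(-beta * x) over the first k samples, k = 1..n
    cum_exp = np.cumsum(np.exp(-beta * x)) / counts
    A = -np.log(cum_exp) / beta     # running estimate of A
    results[beta] = A[-1]
    print(f"beta={beta}: A({n}) = {A[-1]:+.3f}, theory {-0.5 * beta:+.1f}")
```

For β = 1 the final estimate lands close to −0.5, while for β = 4 even 1000 samples leave a large bias toward zero, because the average is dominated by the rare lowest work values.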

The route via the cumulant expansion (7.99) gives very accurate results if the distribution is known to be Gaussian and (7.101) applies. For n independent samples, the variance (mean-square error) in the estimated average

14 See Zwanzig (1954), who defined the cumulant expansion of ln⟨exp(−βx)⟩ in a power series in (−β)ⁿ/n!.


Figure 7.7 Cumulative average of A = −β⁻¹ ln⟨exp(−βx)⟩ over n samples drawn from a normal distribution (average 0, variance σ² = 1), plotted against n for β = 1, 2 and 4. The theoretical limits are −0.5β, indicated by dotted lines.

x̄ = Σᵢxᵢ/n is σ²/n, while the mean-square error in the estimated variance s² = Σᵢ(xᵢ − x̄)²/(n − 1) is 2σ⁴/(n − 1) (Hummer, 2001):

\[
\hat{A} = \bar{x} - \frac{1}{2}\beta\hat{\sigma}^2 \pm \left(\frac{\sigma^2}{n} + \frac{\beta^2\sigma^4}{2(n-1)}\right)^{1/2}, \tag{7.102}
\]

where

\[
\bar{x} = \frac{1}{n}\sum_i x_i, \tag{7.103}
\]
\[
\hat{\sigma}^2 = \frac{1}{n-1}\sum_i (x_i - \bar{x})^2, \tag{7.104}
\]

are the best unbiased estimates for the average and variance. However, if the

distribution is not Gaussian, the higher-order cumulants rapidly add to the

inaccuracy.
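In code, the second-order cumulant estimate (7.102)–(7.104) with its error bar can be sketched as follows (assuming NumPy; the function name is ours):

```python
import numpy as np

def cumulant_estimate(x, beta):
    """Estimate A = -ln<exp(-beta x)>/beta to second cumulant order,
    with the mean-square-error-based uncertainty of (7.102)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xbar = x.mean()                   # best estimate of the average, (7.103)
    s2 = x.var(ddof=1)                # unbiased variance estimate, (7.104)
    A = xbar - 0.5 * beta * s2
    err = np.sqrt(s2 / n + beta**2 * s2**2 / (2.0 * (n - 1)))
    return A, err

rng = np.random.default_rng(0)
A, err = cumulant_estimate(rng.standard_normal(1000), beta=1.0)
print(f"A = {A:.3f} +/- {err:.3f}")   # theory: -0.5
```

Unlike the exponential average, this estimate converges well, but it is only trustworthy when the work distribution is (close to) Gaussian.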

Ytreberg and Zuckerman (2004) propose, based on an extensive error analysis, to select many random sequences of m < n samples from a set of n data and plot the exponential averages thus obtained versus m^(−1/2). Extrapolation to n → ∞ then corrects for a data-size-dependent systematic bias in the averages, and an error estimate can be obtained.
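A minimal sketch of such a procedure (our own illustration, assuming NumPy; the function name, subset sizes and the linear fit in m^(−1/2) are choices made here, not prescriptions from the paper):

```python
import numpy as np

def extrapolated_A(x, beta, sizes=(50, 100, 200, 400), reps=200, seed=1):
    """Average exponential estimates over random subsets of size m, then
    extrapolate linearly in m**(-1/2) to zero (i.e., infinite sample size)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    mean_A, inv_sqrt_m = [], []
    for m in sizes:
        # exponential-average estimate of A from each random subset of size m
        est = [-np.log(np.mean(np.exp(-beta * rng.choice(x, m, replace=False)))) / beta
               for _ in range(reps)]
        mean_A.append(np.mean(est))
        inv_sqrt_m.append(m ** -0.5)
    slope, intercept = np.polyfit(inv_sqrt_m, mean_A, 1)
    return intercept                # value extrapolated to m**(-1/2) = 0

rng = np.random.default_rng(0)
A_extrap = extrapolated_A(rng.standard_normal(1000), beta=1.0)
print(f"extrapolated A = {A_extrap:+.3f}")   # theory: -0.5
```

The subset averages are biased toward zero for small m; fitting them against m^(−1/2) and reading off the intercept reduces that systematic bias, and the scatter about the fit yields an error estimate.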


In practice, Jarzynski's equation can only be used if the irreversible work is small (not exceeding 2k_BT), i.e., if the process is close to equilibrium. As Oostenbrink and van Gunsteren (2006) have shown, integration from equilibrated intermediate points is generally much more efficient than both irreversible fast growth and near-equilibrium slow growth. It is not clear whether this still holds when optimal corrections are applied to the integration by fast or slow growth.

8

Stochastic dynamics: reducing degrees of freedom

8.1 Distinguishing relevant degrees of freedom

Often the interest in the behavior of large molecular systems concerns global behavior on longer time scales rather than the short-time details of local dynamics. Unfortunately, the interesting time scales and system sizes are often (far) beyond what is attainable by detailed molecular dynamics simulations. In particular, macromolecular structural relaxation (crystallization from the melt, conformational changes, polyelectrolyte condensation, protein folding, microphase separation) easily extends into the seconds range and longer. It would be desirable to simplify dynamical simulations in such a way that the “interesting” behavior is well reproduced, and in a much more efficient manner, even if this goes at the expense of “uninteresting” details. Thus we would like to reduce the number of degrees of freedom that are explicitly treated in the dynamics, but in such a way that the accuracy of global and long-time behavior is retained as much as possible.

All approaches of this type fall under the heading of coarse graining, although this term is often used in a more specific sense for models that average over local details. The relevant degrees of freedom may then either be the cartesian coordinates of special particles that represent a spatial average (the superatom approach, treated in Section 8.4), or they may be densities on a grid, defined with a certain spatial resolution. The latter type of coarse graining is treated in Chapter 9 and leads to mesoscopic continuum dynamics, treated in Chapter 10.

The first choice is to distinguish relevant degrees of freedom from irrelevant degrees of freedom. With “irrelevant” we do not mean unimportant: these degrees of freedom can have essential influences on the “relevant” degrees of freedom, but we mean that we don't require knowledge of the detailed behavior of those degrees of freedom. This choice is in a sense arbitrary and
