a statistic generated from a sample survey, will, albeit for some often loosely and impressionistically, locate their survey statistic in an interpretive boundary based on what they know about basic sampling theory.

An overview of the theory of probability sampling

At the heart of sampling theory is the observation that we can prove, mathematically, that the mean of the means of all possible samples drawn from a population will equal the population mean. So if, for example, we were to draw all possible samples of men from the UK population and plot the average height that each of our samples revealed, we would find that the distribution of these plotted height means followed the bell-curve shape of the normal distribution.
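This claim is easy to check with a short simulation. The sketch below is illustrative only (the population mean of 175 cm and standard deviation of 7 cm are assumed figures, not from the text): it draws many samples, records each sample's mean, and shows that the mean of those sample means sits on top of the population mean.

```python
import random
import statistics

random.seed(42)

# Illustrative population of 100,000 'heights' around an assumed mean
# of 175 cm with a standard deviation of 7 cm.
population = [random.gauss(175, 7) for _ in range(100_000)]
pop_mean = statistics.mean(population)

# Draw 2,000 samples of 100 men and record each sample's mean height.
sample_means = [
    statistics.mean(random.sample(population, 100))
    for _ in range(2_000)
]

# The mean of the sample means lies very close to the population mean,
# and the sample means themselves form an approximately normal curve.
print(round(pop_mean, 2), round(statistics.mean(sample_means), 2))
```

With all possible samples (rather than 2,000) the two figures would agree exactly; the simulation simply makes the theoretical result tangible.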

This is a critically important observation, because we know that the normal distribution has certain properties. We know that if we took just one of our surveys of the average male height in the UK, and wanted to use this as an estimate of the true height of the male population in the UK, then we could make certain statements about the accuracy of our single survey. Specifically, we know that there is a 68% chance that our estimate of the height of the UK male population from our single survey will fall within plus or minus one standard error of the mean. The standard error measures the variability of sample estimates around the true population value, and it reflects the variability in the population itself: if every man were the same height, clearly, there would be no risk of an error.

Then, by applying a simple formula, we can say that we are confident that, 68 times out of 100, our single survey estimate will be accurate to within a particular margin. If we wish to increase the confidence with which we make our statement, we will need to broaden the error margin (standard errors) within which we are operating. The convention in commercial survey research is to present the statistics at the 95% level of confidence. This allows us to state how confident we are that the survey height of our sample of men falls within plus or minus two standard errors of the mean.
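The calculation behind these statements can be sketched in a few lines. The sample below is simulated purely so the example runs (a real survey would supply the measured heights); the 175 cm / 7 cm figures are assumptions, not from the text.

```python
import math
import random
import statistics

# Hypothetical single survey: 400 male heights in cm, simulated here
# only for illustration (assumed mean 175, standard deviation 7).
random.seed(1)
sample = [random.gauss(175, 7) for _ in range(400)]

mean = statistics.mean(sample)
sd = statistics.stdev(sample)            # sample standard deviation
se = sd / math.sqrt(len(sample))         # standard error of the mean

# 68% interval: +/- 1 standard error; 95% interval: +/- 2 standard
# errors (the convention described above; 1.96 is the exact multiplier).
print(f"68% interval: {mean - se:.1f} to {mean + se:.1f}")
print(f"95% interval: {mean - 2 * se:.1f} to {mean + 2 * se:.1f}")
```

Note how the 95% interval is simply the 68% interval with the margin doubled: greater confidence is bought with a broader error margin.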


This body of sampling knowledge is important because it not only allows us to estimate the likelihood of our own sample estimate of the height of men being close to the true population height, but it also tells us that, as the size of the sample increases, the error margin within which we interpret our survey statistic will decrease. Conversely, as we reduce the size of the sample survey, so the error margin within which we interpret our survey statistic will increase.

So, without drawing the reader into the details of exactly how these calculations work, the central point to make is that 'classic' statistical theory provides us with a way of interpreting different survey statistics, drawn from varying sizes of sample, within the appropriate range of error, and at a level of confidence (usually the 95% level). In the Notes section we provide a source for checking within what range to interpret statistics drawn from different sample sizes.
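Lookup tables of this kind are essentially tabulations of one standard formula. The sketch below shows it for a survey percentage under pure simple random sampling (the function name and the 50% worst-case proportion are our own illustrative choices):

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p from a simple random
    sample of size n, assuming pure simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

# The margin shrinks as the sample grows, and widens as it shrinks.
# p = 0.5 is the worst case (widest margin) for a given sample size.
for n in (100, 250, 500, 1000, 2000):
    print(n, f"+/- {margin_of_error(0.5, n):.1%}")
```

Running this shows roughly +/-10% at n = 100 narrowing to about +/-2% at n = 2000, which is exactly the sample-size effect described above.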

Conducting probability sampling in practice

Understanding the 'constraints' imposed by what sampling theory tells us is an important starting point for the holistic data analyst in looking at sample survey data. The holistic analyst will then take the parameters prescribed by sampling theory one stage further, taking into account a range of practical and pragmatic considerations about the realities of drawing a commercial sample in real life.

For instance, we know that the theoretical calculation of the estimate of the true statistic for the wider universe under investigation provides a flattering estimate of the range of error within which the statistic should be interpreted. Thus, if, for example, a survey of 1000 London car commuters showed that 30% were resistant to the idea of congestion charging, we will find that the simple statistical formula for pure random sampling will tell us that, with this size of sample, a survey statistic of 30% needs to be interpreted within a range of plus or minus approximately 3% (at the 95% level of confidence). That is to say, with this size of sample, we are confident, 95 times out of 100, that the true population figure will fall in the range of 27% to 33%.
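The +/-3% figure quoted for this example can be reproduced directly from the standard proportion formula:

```python
import math

# Figures from the worked example above: 30% of a sample of 1,000,
# at the 95% level of confidence (z = 1.96).
n, p, z = 1000, 0.30, 1.96

margin = z * math.sqrt(p * (1 - p) / n)   # margin is about 0.0284
print(f"+/- {margin:.1%}")                # roughly the 3% quoted above
```

The exact result is nearer +/-2.8%, which rounds to the "approximately 3%" (27% to 33%) range used in the text.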

The experienced data analyst, however, will realize that the above estimate is based on various assumptions, difficult to achieve in practice, about the way the sample was drawn. As such, in most cases we cannot simply apply 'pure' (simple) sampling theory, because our theoretically based estimate fails to take into account many of the vagaries of real-life (sub-optimum) commercial research sampling practice. So, the practice of locating a statistic within a theoretically derived boundary is a helpful start, but we then need to be aware of the way various practicalities will affect the 'purist' estimate. These are reviewed below.

The design factor

The design factor is the concept that helps us to place some parameters on the way the sampling practicalities have affected our 'purist' estimates of the accuracy of our survey statistic. Specifically, it helps us to measure the impact that various practical adaptations and modifications to the pure, simple random probability sampling method have had on our estimate of the accuracy of our survey statistic. These modifications include the following.

Stratification

One common modification to pure random sampling is stratification. This works to the advantage of the researcher, by helping to reduce the sampling error. This means that it is good practice, when structuring a sample, to organize (or stratify) the sampling frame (that is, the list of potential respondents for our survey) in a way that will guarantee that the sample will represent the key segments of the overall population.

For example, if we know the proportion of people who live in the North West, as opposed to the South West, of England, then it is sensible to organize our sample into strata (all of the individuals living in the North West, the South West, and so on), and then select our sample by applying the sampling interval within each of these strata. By following this process for all of the relevant strata, we are effectively building in a guarantee that our sample will be accurate, because we know, in advance, that we will have the appropriate proportions of all the relevant constituent elements. In sum, this process of stratification helps to reduce the size of the sampling error, because it reduces the chances of drawing a 'fluke' sample by chance.
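The allocation step of proportional stratification can be sketched in a few lines. The regional shares below are invented purely for illustration (they are not from the text, nor real UK figures):

```python
# Illustrative population shares by region (assumed figures), used to
# allocate a sample of 1,000 interviews proportionally across strata.
strata_shares = {
    "North West": 0.11,
    "South West": 0.09,
    "London": 0.13,
    "Rest of UK": 0.67,
}

sample_size = 1000
allocation = {
    region: round(share * sample_size)
    for region, share in strata_shares.items()
}

# Each stratum is then sampled independently (e.g. by applying the
# sampling interval within the stratum), so the final sample is
# guaranteed to carry the right regional mix.
print(allocation)   # {'North West': 110, 'South West': 90, ...}
```

Because every stratum is represented in exactly its population proportion, a 'fluke' sample that over-represents one region simply cannot occur.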

Clustering

Stratification's positive impact on the sampling process could be countered by the fact that, in most commercial samples, there will also be some 'clustering' of interviews, and this will increase the sampling error. Clustering reflects the fact that it will not be practicable to interview people spread across the whole of the UK. So, virtually all commercial survey samples will introduce a process of clustering, or grouping, interviews into convenient-to-access groups. (This, of course, is much less of an issue with telephone and Internet surveys.)

This process of moving from a totally unclustered sample to a more practical clustered interviewing design is a major departure from the assumptions underpinning pure, simple random sampling. Clustering will markedly increase the sampling error. Intuitively, we can see why: clearly, there is a risk associated with asking a (small) clustered group of individuals to represent the wider, more scattered universe. It is like asking just those who live in one block of flats to represent the views of the wider community.

Calculating the 'design factor'

So, to summarize, commercial survey designers will stratify their samples, while also attempting, for face-to-face surveys, to strike a balance between introducing clustering in order to save interviewers' travelling costs and not dramatically increasing the sampling error. It is the impact of these practical modifications, stratification and clustering, on the 'classic' sampling error that is collectively referred to as the 'design factor'.

Doing the exact calculation of the design factor is complex, so some guidelines are helpful. First, it is worth noting that, even with some of the most prestigious UK surveys, where probability-based sampling and the minimum of clustering are employed, the design factor is still in the region of 1.5. (The analyst takes the sampling error calculated on the assumption of pure simple random sampling, and then multiplies this estimate by the design factor figure of 1.5 to arrive at a 'realistic' estimate of sampling error.) So, given that few commercial surveys aspire to the standard of well-funded prestigious surveys, a typical probability-based sample survey will often carry a design factor of 2.0. (This is before we arrive at the issue of less rigorous, non-probability, sampling methods, which we discuss below.)
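In practice the adjustment is a simple multiplication. The sketch below applies the rule-of-thumb factors quoted above (1.5 and 2.0) to the earlier congestion-charging example; the function name is our own, and the figures follow the guidelines in the text rather than an exact design-factor calculation.

```python
import math

def adjusted_margin(p, n, design_factor, z=1.96):
    """Pure simple-random-sampling 95% margin of error for a proportion,
    multiplied by the design factor for a 'realistic' estimate."""
    srs_margin = z * math.sqrt(p * (1 - p) / n)
    return srs_margin * design_factor

# The 30%, n = 1000 example: about +/-2.8% under pure theory...
pure = adjusted_margin(0.30, 1000, design_factor=1.0)
# ...about +/-4.3% for a prestigious, minimally clustered survey...
prestigious = adjusted_margin(0.30, 1000, design_factor=1.5)
# ...and about +/-5.7% for a typical commercial probability sample.
typical = adjusted_margin(0.30, 1000, design_factor=2.0)
print(f"{pure:.1%}  {prestigious:.1%}  {typical:.1%}")
```

Note how the design factor of 2.0 roughly doubles the interpretation range: the 27% to 33% boundary of pure theory widens to roughly 24% to 36% for a typical commercial sample.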

Sample bias

Understanding the extent of the sampling error for a statistic generated from a probability sample still only tells us part of the story. We also need to understand the concept of sampling bias. This, the reader will recall, involves understanding: (a) the extent to which people who should have been included in the sample were, in fact, included, and (b) the extent to which people who were invited to take part in the study did so. Here, empirical experience tells us that meeting the conditions for true probability sampling, ensuring that each individual in