in small sample situations – aimed at 'domesticating' the Bayesian approach.

Our thinking here is to match one of the great advantages of classic statistical confidence tests, namely their simplicity: they can be readily applied to any sample of more than 30 respondents and are comparatively easy to understand. By contrast, Bayesian tests are very difficult to apply.

146 Establishing the interpretation boundary

So our approach is to employ orthodox significance tests as our guide where we have a large sample, together with what we know about sub-group sample sizes, but to employ simple Bayesian tests as our second line of justification when looking at smaller base sizes (of, say, between 25 and 75). This helps us identify patterns in the data which are not sufficiently pronounced to cross the 'finishing line' of orthodox tests.

Our pragmatic solution is of value because it integrates classic and Bayesian tests to achieve the simplicity of confidence interval tests, whilst enabling the analyst to factor prior knowledge into the assessment, thereby ensuring we arrive at a true picture of the way things are. We have embedded this thinking into a simple spreadsheet, which enables researchers to integrate prior knowledge – in the form of prior probabilities – into the confidence limit tests of orthodox statistics. This gives us something analogous – in Bayesian terms – to the standard error calculations currently used for given sample sizes.

The method simply inverts the usual confidence interval test to yield a single probability measure that an observed difference is, in fact, significant. Thus, instead of setting an arbitrary 'finishing post' of 95% or 99% and a required spread between two figures for a difference to be considered significant, the inverted methodology produces a single measure of the probability that an observed difference between two figures is significant. For example, it may give the answer that a difference of 7% between two sub-samples of a particular size is 78% likely to be significant, rather than simply saying that the difference is not significant at the 95% confidence level and leaving it at that. Getting this single figure enables us to take the analysis a step further, and integrate, in a formal way, our relevant prior knowledge about the situation to which the figures refer.
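The inversion described above amounts to reading off the confidence level at which the observed gap between two proportions would just reach significance. The book's spreadsheet is not reproduced here, so the following is a minimal sketch under the assumption that the underlying test is the standard two-sided z-test for a difference of proportions; it reproduces the 7%-difference, 78%-likely figure used in the text:

```python
from math import sqrt, erf

def inverted_confidence(p1, n1, p2, n2):
    """Confidence level at which the observed difference between two
    sample proportions would be judged significant: the usual two-sided
    z-test 'inverted' to return a single probability rather than a
    pass/fail verdict at a fixed 95% or 99% threshold."""
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # standard error of the difference
    z = abs(p1 - p2) / se                                # z-score of the observed gap
    phi = 0.5 * (1 + erf(z / sqrt(2)))                   # standard normal CDF at z
    return 2 * phi - 1                                   # two-sided confidence level

# The worked example from the text: 48% of 115 vs. 55% of 220
print(round(inverted_confidence(0.48, 115, 0.55, 220), 2))  # ≈ 0.78
```

An orthodox test would simply report this difference as "not significant at the 95% level"; the inverted form preserves the information that it is nonetheless 78% likely to be real.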

Let us say that we are about 75% certain, based on what else we know, that a figure does indeed reflect a genuine difference. We can then use a very simple version of Bayes' formula to integrate the 78% likelihood from the inverted classical confidence test with our 75% prior knowledge. If we do this, we get a posterior probability of 91%. Thus, our overall confidence in the data has risen to 91%. Many readers may be surprised that the probability has gone up, expecting the combination of two probabilities to result in a lower probability. The fact that our intuitive response to the idea of combining probabilities tends to lead us astray demonstrates the importance of disciplining such influences with a simple formal tool such as this.
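The 75%-and-78%-gives-91% step is consistent with the odds form of Bayes' rule for two independent sources of evidence; the text does not spell out its "very simple version" of the formula, so treating it as this independent-evidence product is an assumption, but it reproduces the figure exactly:

```python
def combine(prior, likelihood):
    """Combine a prior probability with an independent likelihood using
    the odds form of Bayes' rule. Two probabilities each above 50%
    reinforce one another, which is why the posterior rises above both."""
    numerator = prior * likelihood
    return numerator / (numerator + (1 - prior) * (1 - likelihood))

# 75% prior belief combined with the 78% inverted-test likelihood
print(round(combine(0.75, 0.78), 2))  # ≈ 0.91
```

This also makes the "surprising" direction of the result transparent: two mutually supporting pieces of evidence, each better than a coin flip, yield a posterior stronger than either alone.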

In the related training module to support this book – available on the Wiley website – readers will find details of how to obtain access to the model. In Figure 10.2 we provide a conceptual overview of how the model works.

We conclude this chapter with an example of how the 'constraints & enablers' can work together in establishing the overall interpretation boundary for a multiple data set. Here, we return to our example of assessing the attitudes of small and medium-sized UK businesses (SMEs) towards joining the Single European Currency (Table 10.3).


Figure 10.2 – Overview of a Bayesian directional estimator for two sub-samples.
[Figure content: prior probability that this difference is significant, based on our prior knowledge and expectation: 75% (A). Sub-group 1: sample 115, statistic 48%; sub-group 2: sample 220, statistic 55%; actual difference: 7%. Conditional probability based on sampling error: 78% (B). Posterior probability that this difference is significant (A combined with B): 91%.]

Table 10.3 – An illustrative example of 'constraints & enablers' – Single European Currency example

• The evidence: let us say we have: secondary data; 30 depth interviews with SME businesses; 10 interviews with industry experts from the SME sector; 6 focus groups conducted among small businesses; a telephone quota sample survey of 400 UK SMEs; and an Internet survey of 1000 small businesses. How would the holistic data analyst make sense of this multiple data set?

• Step one – applying the theoretical constraints to the qualitative evidence: what we know about grounded theory allows us to 'formally' evaluate the SME and expert depth interviews and focus groups. Here, we could inspect all of the issues that surface from the different types of qualitative research against what we already know. This allows us to distinguish consistently raised concepts from what appear to be outlier observations, not resonant with existing prior knowledge.

• Step two – applying the theoretically driven constraints to the quantitative data: we know from the theory of the sampling distribution that a statistic of 50% drawn from a sample of 400 SMEs will be accurate to plus or minus 5%, at the 95% level of confidence (assuming the conditions of simple random sampling). We could also apply this body of theory to calculate the sampling error for our Internet survey of 1000 businesses, and also conduct various tests to establish whether there is a (statistically significant) difference between different percentages.

• Step three – applying constraints driven by empirical 'theory': there is also the body of empirical evidence that allows us to interpret the implications of the 'response/strike rate' achieved on the telephone quota and Internet sample surveys. Here we know that ideally the response rate should be 65% (or more) if we are to assume that respondents are similar in their attitudes to non-respondents. If it is less than this, we will have to factor this 'sample bias' into our interpretation of the evidence.




• Step four – utilizing the compensation principle: earlier we introduced the 'compensation principle'. So now we could, for example, by referring to our desk research and secondary evidence, begin to 'compensate' for any limitations in our evidence. For instance, the Internet survey may over-represent the more technologically minded. So we would need to 'adjust' for this potential bias by 'triangulating' our survey data with other available evidence.

• Step five – applying the 'enablers' to the qualitative and quantitative evidence: we can now, in the spirit of holistic data analysis, begin to 'stretch' the above boundary – the constraints – laid down by the above theoretical and empirical principles. This could be conducted at two levels:
  – Level one – applying the 'directional indicators' principle: we could look at how the evidence from our different sources is coming together to create a general 'shape or pattern' that begins to tell the SME 'story'. In an informal way, we could examine how our data 'fits' with existing prior knowledge on the subject of SMEs and Europe. This understanding of the fit between the pattern of our data and the wider shape of the contextualizing evidence could give us the confidence to embrace evidence that, while not statistically significant, does seem to be important when set in the context of our prior knowledge. We will attach importance to evidence that is directionally consistent and squares with our current broader prior knowledge on the topic.

  – Level two – the more formal application of Bayesian thinking: we could, on occasion, take the above Bayesian thinking to a more formal level. To 'stretch' our data beyond the boundaries driven by orthodox analysis, we could work with some of the statistical concepts and principles outlined at the end of this chapter.

• So, in sum, the holistic data analyst will always use orthodox 'constraints' as the bedrock of their evaluation. Then, to enhance their interpretation and the quality of the end decision-making process, they will see how far this take on the data can legitimately be stretched by the enablers – factoring in what we already know.
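The sampling-error figures quoted in step two of Table 10.3 follow from the standard formula for the margin of error of a sample proportion. A minimal sketch (assuming, as the table does, the conditions of simple random sampling, with z = 1.96 for 95% confidence):

```python
from math import sqrt

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a sample proportion p from a simple
    random sample of size n (z = 1.96 for the 95% confidence level)."""
    return z * sqrt(p * (1 - p) / n)

# Step two's figures: a statistic of 50% from the telephone survey of 400 SMEs
print(round(margin_of_error(0.50, 400), 3))   # ≈ 0.049, i.e. roughly ±5%
# The same formula applied to the Internet survey of 1000 businesses
print(round(margin_of_error(0.50, 1000), 3))  # ≈ 0.031, i.e. roughly ±3%
```

Note that p = 0.50 gives the widest interval for a given sample size, which is why the ±5% figure for n = 400 is a conservative rule of thumb.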

Having introduced the reader to the holistic analysis concepts of using enablers to stretch the initially constrained statistical data, we now turn to the way in which the holistic data analyst will apply a range of 'analytical' or 'knowledge filters' in beginning to make sense of what surveys – respondents – are really