ence in shaping membership functions using presently widely accepted logics, and changing the logics used would invalidate that experience. We do not suggest such a change for fuzzy control problems. There is no such backlog of experience in handling non-numeric data; in this case, we suggest using membership functions that add to one at every point, and using the bounded sum/difference logics when combining membership functions.

92 COMBINING UNCERTAINTIES

Figure 5.4 A membership function Small OR Medium, using correlation logic.

Figure 5.5 Weighted membership functions Small and Medium, ORd by min–max logic.

FLOPS does not make the bounded sum operator available as such to the user, although it is not difficult to compute the result, and the reset command can be used to set the antecedent confidence to the bounded sum confidence. However, in computing the results of fuzzy comparisons involving the OR operator (x <= Y or x >= Y), the bounded sum operator is used internally in a fashion transparent to the user.
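The bounded sum and bounded difference operators themselves are simple to state. The following sketch is in Python rather than FLOPS (our choice of language for illustration; the function names are ours, not FLOPS built-ins):

```python
def bounded_sum(a, b):
    """Bounded sum (Lukasiewicz OR): confidences add, capped at 1."""
    return min(1.0, a + b)

def bounded_difference(a, b):
    """Bounded difference: confidences subtract, floored at 0."""
    return max(0.0, a - b)

# Confidence in (x <= Y) ORed with confidence in (x >= Y):
print(bounded_sum(0.4, 0.7))        # 1.0 (capped at 1)
print(bounded_sum(0.2, 0.3))        # 0.5
print(bounded_difference(0.2, 0.3)) # 0.0 (floored at 0)
```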

There is an open question as to which logical OR should be used when combining weighted membership functions prior to defuzzification. The Zadehian OR (max) is commonly used for this purpose, but this tends to produce notched membership functions similar to the notch in Figure 5.2. Control engineers are primarily interested in shaping a response surface, and usually use ad hoc methods; these notches have not prevented them from obtaining a desired response. If, however, we are interested in the combined curves to obtain grades of membership in a concept rather than in a defuzzified value, the notches may be worse than annoying. In general, we feel that for a reasonable family of membership functions, the membership function for the entire linguistic value obtained by ORing weighted individual functions should be convex. We define “reasonable” to mean that adjacent functions should intersect at no less than the 0.5 membership value, and that the ordered grades of membership of the individual linguistic values should also be convex. This subject is for future research.
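The notch can be seen numerically. In the Python sketch below (our illustration; the triangular Small and Medium functions, chosen so that they add to one over their overlap and both weighted at full confidence, are assumptions for the example), the Zadehian max dips to 0.5 at the crossover point, while the bounded sum stays at one and remains convex:

```python
def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Small and Medium chosen to add to one across their overlap [0, 5]
small  = lambda x: tri(x, -5.0, 0.0, 5.0)
medium = lambda x: tri(x,  0.0, 5.0, 10.0)

xs = [i * 0.5 for i in range(11)]                      # 0.0 .. 5.0
max_or  = [max(small(x), medium(x)) for x in xs]       # Zadehian OR
bsum_or = [min(1.0, small(x) + medium(x)) for x in xs] # bounded sum OR

print(min(max_or))   # 0.5 at the crossover x = 2.5: the notch
print(min(bsum_or))  # 1.0 everywhere: no notch, combined function convex
```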

5.4 BAYESIAN METHODS

Reverend Thomas Bayes derived his theorem in the eighteenth century, and

although mathematically irrefutable it has been attended by controversy ever

since. The eminent statistician R. A. Fisher stated that it was his opinion that

Bayesian methods are founded upon error, and should be totally rejected. It was

the pleasure of one of us to attend a meeting at which two papers on Bayesian


methods were presented. The first paper was titled “The applicability of Bayes’ theorem to problems of medical diagnosis”; the second, which followed immediately, was titled “The inapplicability of Bayes’ theorem to problems of medical diagnosis”. Bayes’ theorem requires knowledge of prior conditional probabilities, but often that knowledge is lacking. What is in question is Bayes’ assumption of the “equal distribution of ignorance” if prior probabilities are not known.

Bayes’ theorem has achieved some success in expert systems in the last 20 years, and we will briefly describe it here.

The rule itself is based on conditional probabilities. We write the probability that B is true, given that P is true, as p(B|P). Bayes’ rule reverses this, and is

p(Ak|B) = p(B|Ak) p(Ak) / p(B)    (5.4)

or equivalently, since p(B) = Σi p(B|Ai) p(Ai),

p(Ak|B) = p(B|Ak) p(Ak) / Σi p(B|Ai) p(Ai)    (5.5)

In applying Bayes’ rule to inference, we say that Ak is a hypothesis we wish to test.

With the information we have acquired to date, p(Ak) is the probability that Ak is

true. Now, we uncover a new piece of evidence B. We know from past experience

that the probability that B is true varies with all the various possible hypotheses Ai ,

and have a table of prior knowledge of p(BjAi) and p(Ai) for all possible Ai. This

prior knowledge permits us to write Table 5.4.

Table 5.4 shows that our confidence in Ai can either increase or decrease as a result of the new evidence B. It also illustrates the problem with Bayesian methods: where do we get all that prior knowledge? Our table assumes a very simple problem, with only one new piece of evidence and only three possible hypotheses. In the real world, we almost always have many more pieces of evidence and many more hypotheses. It is not often that we can accumulate reliable figures for the prior knowledge Bayes’ theorem requires. Instead, subjective estimates are made or some simple rule is applied, such as the “equal distribution of ignorance”, which assumes that all possibilities are equally likely. It is to the “equal distribution of ignorance” that Fisher objected so violently.

TABLE 5.4 Sample Application of Bayes’ Rule to Updating Confidence in P from New Evidence B

p(B|Ai)    p(Ai)    p(B|Ai) p(Ai)    p(Ai|B) = p(B|Ai) p(Ai) / Σk p(B|Ak) p(Ak)
0.2        0.3      0.06             0.15
0.4        0.5      0.2              0.5
0.7        0.2      0.14             0.35
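The update in Table 5.4 can be verified in a few lines. The Python below (our illustration; the numbers are those of the table) applies Eq. (5.5) directly:

```python
# Likelihoods p(B|Ai) and priors p(Ai) for the three hypotheses of Table 5.4
p_B_given_A = [0.2, 0.4, 0.7]
p_A         = [0.3, 0.5, 0.2]

joint = [l * p for l, p in zip(p_B_given_A, p_A)]  # p(B|Ai) p(Ai)
p_B = sum(joint)                                   # 0.06 + 0.20 + 0.14 = 0.40
posterior = [j / p_B for j in joint]               # Bayes' rule, Eq. (5.5)

print([round(j, 2) for j in joint])      # [0.06, 0.2, 0.14]
print([round(q, 2) for q in posterior])  # [0.15, 0.5, 0.35]
```

Note that the posteriors sum to one by construction, and that A2’s confidence is unchanged (0.5 before and after) while A1’s falls and A3’s rises.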


5.5 THE DEMPSTER–SHAFER METHOD

Dempster–Shafer methods (Dempster, 1967) use dual truth values: a lower level,

called belief, representing the extent to which the evidence supports a hypothesis;

and an upper level, called plausibility, representing the extent to which the evidence

fails to refute the hypothesis. These are closely analogous to the dual measures

necessity and possibility in fuzzy systems theory. The method is concerned with

combining evidence regarding the truth of a hypothesis from different sources.

Our presentation here is paraphrased from that of Jackson (1999), Chapter 21. We

seek to establish belief and plausibility of some set of hypotheses from evidence.

The representation and manipulation of possibility and necessity in rule-based

systems will be taken up in Chapter 8.

A hypothesis space in Dempster–Shafer theory is represented by Q, a space that holds all the individual hypotheses hi. All hypotheses are assumed to be mutually exclusive, and the set of hypotheses Q is assumed to be exhaustive. We assume that it is possible to obtain evidence that each single subset of Q, A1, A2, . . . , is true. (A subset Ai may be a single hypothesis, or may be the entire hypothesis set Q.) The hypotheses in each subset may overlap those in other subsets. We also have pieces of evidence yj, included in a set C. Each piece of evidence yj will point to a subset Ai of Q that holds all the hypotheses that are supported by yj; the subset Ai to which yj points is called a focal element. Since the hypotheses are exhaustive, that is, there is at least one hypothesis consistent with every piece of evidence, no evidence will point to a null set.

Key to the Dempster–Shafer method is the idea of a probability assignment. A basic probability assignment (bpa) is defined as a function m(Ai) that maps each subset Ai of the hypotheses to a value included in [0, 1]. The sum of all m(Ai) over all subsets of Q is 1. The belief Bel in any focal element A is the sum of all the basic probability assignments for all subsets of A:

Bel(A) = Σ{B ⊆ A} m(B)    (5.6)
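Equation (5.6), together with the plausibility as the mass not committed to the complement, can be sketched in Python. The hypothesis names h1–h3 and the particular mass values below are invented for illustration; the subset representation via frozensets is our choice:

```python
# Hypothesis space Q = {h1, h2, h3}; a basic probability assignment m maps
# subsets of Q to [0, 1] and sums to 1 over all focal elements.
m = {
    frozenset({"h1"}):             0.4,
    frozenset({"h2"}):             0.1,
    frozenset({"h1", "h2"}):       0.2,
    frozenset({"h1", "h2", "h3"}): 0.3,  # mass on Q itself represents ignorance
}
assert abs(sum(m.values()) - 1.0) < 1e-9

def bel(A, m):
    """Belief in A: total mass of focal elements contained in A, Eq. (5.6)."""
    return sum(mass for B, mass in m.items() if B <= A)

def pl(A, m, Q):
    """Plausibility of A: mass not committed to the complement of A."""
    return 1.0 - bel(Q - A, m)

Q = frozenset({"h1", "h2", "h3"})
A = frozenset({"h1", "h2"})
print(round(bel(A, m), 2))  # 0.7: masses 0.4 + 0.1 + 0.2 support subsets of A
print(round(pl(A, m, Q), 2))  # 1.0: no mass supports {h3} alone
```

Belief and plausibility bracket the truth of A: here Bel(A) = 0.7 and Pl(A) = 1.0, the interval within which the evidence leaves A.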