11.02.2014 Views

Chapter X: Introduction to Fuzzy Set Theory Uncertainty is universal ...

Chapter X: Introduction to Fuzzy Set Theory Uncertainty is universal ...

Chapter X: Introduction to Fuzzy Set Theory Uncertainty is universal ...

SHOW MORE
SHOW LESS

You also want an ePaper? Increase the reach of your titles

YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.

<strong>Chapter</strong> X: <strong>Introduction</strong> <strong>to</strong> <strong>Fuzzy</strong> <strong>Set</strong> <strong>Theory</strong><br />

<strong>Uncertainty</strong> <strong>is</strong> <strong>universal</strong>! Take for example a computer system whose task <strong>is</strong> <strong>to</strong> recognize trees in a v<strong>is</strong>ual<br />

image. Sources of uncertainty in th<strong>is</strong> task include (but are not limited <strong>to</strong>): no<strong>is</strong>e in the sensed imagery,<br />

d<strong>is</strong><strong>to</strong>rtion due <strong>to</strong> pose and lens conditions, variability of the class of interest (what <strong>is</strong> a “tree”?),<br />

faithfulness of the features used <strong>to</strong> described a tree, m<strong>is</strong>sing features, spatial context (a tree in a forest vs.<br />

a tree in New York City), temporal context (a tree in summer vs. a tree in winter), the choice of<br />

recognition algorithm, and so on. If mult<strong>is</strong>pectral (or hyperspectral) imagery <strong>is</strong> available or multiple<br />

algorithms are applied <strong>to</strong> the dec<strong>is</strong>ion making aspect, then the problems of how <strong>to</strong> fuse compensa<strong>to</strong>ry or<br />

even conflicting information becomes important. The h<strong>is</strong><strong>to</strong>rical framework for dealing with uncertainty<br />

has been probability theory. Th<strong>is</strong> <strong>is</strong> a powerful <strong>to</strong>ol that has served science well in modeling situations<br />

where the primary source of uncertainty <strong>is</strong> randomness. In some instances, we argue that uncertainty<br />

takes other forms. In considering a new gene sequence, for example, it may be important <strong>to</strong> know how<br />

similar it <strong>is</strong> <strong>to</strong> a particular sequence – if it <strong>is</strong> very similar, it probably has the same function, if it <strong>is</strong> less<br />

similar, it may produce a different effect. It <strong>is</strong> not so much a question of “whether or not” the two genes<br />

are the same, as it <strong>is</strong> a question of “how much” th<strong>is</strong> particular instance of the new gene resembles a<br />

pro<strong>to</strong>type. The probability that “I ate a ham sandwich for lunch” <strong>is</strong> ½ vs. “I ate about half of the ham<br />

sandwich for lunch”. Potable liquid example? Maybe use peaches, plums, nectarines?????<br />

In many cases, there <strong>is</strong> a lack of clear boundaries between classes of objects – when does an image object<br />

s<strong>to</strong>p being a tree and become, say, a bush? In doing some landscaping, we planted a Crepe Myrtle, which<br />

turns out <strong>to</strong> be both a bush and a tree, i.e., there are objects that “completely” belong <strong>to</strong> multiple classes.<br />

In these situations, alternate methodologies should be utilized <strong>to</strong> aid us in making au<strong>to</strong>mated evaluations.<br />

<strong>Fuzzy</strong> set theory and fuzzy logic provide a different way <strong>to</strong> view the problem of modeling uncertainty and<br />

offer a wide range of computational <strong>to</strong>ols <strong>to</strong> aid dec<strong>is</strong>ion making. Clearly, it <strong>is</strong> not our intention <strong>to</strong><br />

dimin<strong>is</strong>h the vital role of probabil<strong>is</strong>tic models in science and engineering. <strong>Fuzzy</strong> set theory and fuzzy<br />

logic provide complementary information <strong>to</strong> that which comes from a probabil<strong>is</strong>tic view.<br />

Many science and engineering problems are formulated in a determin<strong>is</strong>tic manner. Most of these<br />

problems are defined by fixed objective functions and solved through optimization. Many dynamic<br />

processes are modeled using differential equations with determin<strong>is</strong>tic behavior. However, there are at<br />

least three situations in which fuzziness should be considered, i.e., intrinsic fuzziness in biological<br />

systems, multiple roles of a biological object, and fuzzy descriptions of biological phenomena.


Another type of fuzziness in biomedical research results from the fuzzy description of biological terms.<br />

Our descriptions of many biological concepts often have difficulty fitting in<strong>to</strong> a determin<strong>is</strong>tic (cr<strong>is</strong>p)<br />

explanation. As a result, our knowledge, concepts, and representations of biological terms may also be<br />

fuzzy, and fuzzy set theory <strong>is</strong> useful <strong>to</strong> describe these terms. For example, the species definitions for<br />

microbes can be fuzzy due <strong>to</strong> recombination of the genetic materials across species [Hanage et al., 2005].<br />

The concept of “protein function” <strong>is</strong> sometimes fuzzy because it <strong>is</strong> often based on whimsical terms or<br />

contradic<strong>to</strong>ry nomenclature [Jansen and Gerstein, 2004]. Th<strong>is</strong> currently presents a challenge for<br />

functional genomics. In addition, descriptions for similarity and typicality can be fuzzy. For example,<br />

how much do two proteins resemble each other, what properties do they (partially) share, how close <strong>is</strong> a<br />

given protein <strong>to</strong> the pro<strong>to</strong>typical sequence of a protein family, etc. Such fuzziness could result from the<br />

limitations of classifications, natural language, or poor understanding of the underlying mechan<strong>is</strong>m.<br />

Tolerance of fuzziness allows us <strong>to</strong> explore these biological concepts effectively.<br />

But we still ask the question: do we really need fuzziness? If the world were determin<strong>is</strong>tic, the answer<br />

would be no. Boolean logic and probability theory would clearly suffice. Th<strong>is</strong> <strong>is</strong> the “balls in the urn”<br />

world; you know, where you put 12 red balls and 7 green balls in<strong>to</strong> an urn and ask questions like “If you<br />

pick out 5 balls, what’s the probability that they are all red?” If however, you put balls of varying radii<br />

in<strong>to</strong> the urn and picked out 5, what’s the probability that they are all SMALL? Here the event, SMALL<br />

balls, <strong>is</strong> ambiguous. You can convert th<strong>is</strong> in<strong>to</strong> the former case if you set a threshold radius below which a<br />

ball <strong>is</strong> considered SMALL, but then 2 balls just on each side of the threshold will feel the same, but one<br />

will be SMALL and the other Not SMALL. So, establ<strong>is</strong>hing such a threshold doesn’t match well with<br />

human intuition in ambiguous cases. It would be nice <strong>to</strong> have models and calculi that handle these<br />

situations.<br />

<strong>Fuzzy</strong> set theory in general and fuzzy logic specifically are natural ways <strong>to</strong> model ambiguous events that<br />

occur in human-like reasoning. People have no trouble operating with phrases such as “large r<strong>is</strong>k fac<strong>to</strong>r”,<br />

“somewhat likely <strong>to</strong> be involved in cancer”, “significantly hyper methylated”, etc. As will be seen, rules<br />

containing such ambiguous clauses can be successfully handled in a fuzzy logic system.<br />

The beauty (and also a danger, if we are not careful) of fuzzy set theory <strong>is</strong> that it offers a multitude of<br />

calculi for the fusion of partial support for a hypothes<strong>is</strong> under investigation, that <strong>is</strong>, flexible mechan<strong>is</strong>ms<br />

<strong>to</strong> increase or decrease confidence in a dec<strong>is</strong>ion as evidence unfolds. In h<strong>is</strong> seminal text on computer<br />

v<strong>is</strong>ion, [Marr, 1982], David Marr stated two principles <strong>to</strong> be followed in the design of intelligent (v<strong>is</strong>ion)<br />

algorithms. The first <strong>is</strong> called the Principle of Least Commitment (PLC). He states it simply as “Don’t do<br />

something that later must be undone”. Hence, in a complex computing scenario, one where there are<br />

many dec<strong>is</strong>ion making steps, avoid making determin<strong>is</strong>tic dec<strong>is</strong>ions for as long as <strong>is</strong> possible. It <strong>is</strong> very


difficult, perhaps impossible, <strong>to</strong> recover from a wrong cr<strong>is</strong>p dec<strong>is</strong>ion early on. Keep your options open<br />

until the situation demands a final answer. While Marr was interested in computer v<strong>is</strong>ion, the PLC<br />

certainly applies <strong>to</strong> bioinformatics applications. Clearly, the concept of assigning and maintaining<br />

degrees of membership (perhaps confidence in competing hypotheses) or more general lingu<strong>is</strong>tic labels in<br />

fuzzy set theory support the PLC for complex dec<strong>is</strong>ion making applications as in bioinformatics.<br />

The second principle of Marr <strong>is</strong> called the Principle of Graceful Degradation (PGD). By th<strong>is</strong> he meant<br />

that algorithms should deliver a partial (reasonable) answer as input degrades. In other words, intelligent<br />

algorithms should encompass a degree of robustness and continuity. Here also, techniques that utilize<br />

membership degrees or other fuzzy constructs in the calculation of their response <strong>to</strong> input conditions have<br />

the potential <strong>to</strong> degrade much more gracefully than their cr<strong>is</strong>p counterparts. Consider, for example, a<br />

simple cancer screening task that relies on a single test value, say the amount of expression of a particular<br />

gene in a microarray assay (not overly real<strong>is</strong>tic, but for illustrative purposes). If the outcome <strong>is</strong> binary<br />

(cancer/non-cancer) and the algorithm <strong>is</strong> cr<strong>is</strong>p, say based on a threshold, then a slight amount of no<strong>is</strong>e in<br />

the test score can actually flip the screening result from non-cancer <strong>to</strong> cancer or vice versa. If however<br />

the output <strong>is</strong> a membership in the cancer diagnos<strong>is</strong> modeled as a trapezoid function around the threshold<br />

value [Klir and Yuan, 1995], such small perturbations only change the degree of r<strong>is</strong>k in a correspondingly<br />

small amount. Figure X.xx shows a typical trapezoidal curve, along with other standard fuzzy<br />

membership functions. Th<strong>is</strong> overly simple example illustrates the point that fuzzy models embrace the<br />

concept of the PGD.<br />

X.3 Brief H<strong>is</strong><strong>to</strong>ry of the Field<br />

Concepts of vagueness and fuzziness have been contemplated in mathematics and science for quite<br />

awhile. For example, in 1923 Bertrand Russell stated “All traditional logic habitually assumes that<br />

prec<strong>is</strong>e symbols are being employed. It <strong>is</strong> therefore not applicable <strong>to</strong> th<strong>is</strong> terrestrial life, but only <strong>to</strong> an<br />

imagined celestial ex<strong>is</strong>tence.” [Russell, 1923]. Like Russell, the philosopher Max Black was concerned<br />

with vagueness and imprec<strong>is</strong>ion in language, and the effect of these concepts on logic [Black, 1937]. In<br />

fact, he believed that all terms whose application involves using our senses are vague. Black, in 1937,<br />

actually came up with the concept that we now associate with membership functions. He even conducted<br />

a cognitive psychological experiment with a group of people that effectively constructed membership<br />

functions exemplifying vagueness of certain words. However, most people attribute the beginning of<br />

fuzzy set theory <strong>to</strong> Lotfi A. Zadeh’s 1965 paper [Zadeh, 1965] that developed th<strong>is</strong> <strong>to</strong>pic in its current<br />

form. An excellent treatment of the h<strong>is</strong><strong>to</strong>ry of fuzzy sets and fuzzy logic can be found in [Se<strong>is</strong>ing, 2005].<br />

The journal <strong>Fuzzy</strong> <strong>Set</strong>s and Systems publ<strong>is</strong>hed a “40th Anniversary of <strong>Fuzzy</strong> <strong>Set</strong>s” in December 2005


that contains 14 position papers covering various aspects of the role and future prospects of fuzzy sets<br />

[Dubo<strong>is</strong>, 2005].<br />

The mathematical bas<strong>is</strong> for formal fuzzy logic can be found in infinite-valued logics, first studied by the<br />

Pol<strong>is</strong>h logician Jan Lukasiewicz in the 1920s (see [Borkowski, 1970]). Lukasiewicz constructed a series<br />

of multi-valued logical systems, generalizing from small finite numbers of truth-values <strong>to</strong> those<br />

containing infinite sets of truth values. H<strong>is</strong> work and calculation formulae are ingrained in modern fuzzy<br />

set theory and fuzzy logic, the genes<strong>is</strong> of which <strong>is</strong> credited <strong>to</strong> Zadeh in h<strong>is</strong> seminal three part treat<strong>is</strong>e on<br />

the theory and applications of lingu<strong>is</strong>tic variables [Zadeh, 1975a; Zadeh 1975b; Zadeh 1976].<br />

Perhaps the biggest boost <strong>to</strong> the v<strong>is</strong>ibility and perceived utility of fuzzy set theory came from the<br />

application of rule-based fuzzy systems <strong>to</strong> problems in control [Mamdani and Assilian, 1975; Mamdani,<br />

1977; Takagi and Sugeno, 1985; Sugeno, 1985, Verbruggen and Babuska, 1999, Passino and Yurkovich,<br />

1998]. In what has become commonplace now, sets of lingu<strong>is</strong>tically described rules were created and<br />

inserted in<strong>to</strong> a variety of non-linear control systems. The ease of design and the smoothness of the<br />

control surface from only a handful of rules made fuzzy controllers very popular in a variety of products<br />

from the au<strong>to</strong>motive industry, consumer electronics markets, etc. <strong>Fuzzy</strong> controllers are well suited for<br />

low-cost embedded systems.<br />

While the big economic impact of fuzzy set theory and fuzzy logic centers on control, particularly in<br />

consumer electronics, there has been, and continues <strong>to</strong> be, much research and application of these<br />

technologies in pattern recognition, information fusion, data mining, and au<strong>to</strong>mated dec<strong>is</strong>ion making<br />

[Bezdek et al., 1999, Keller et al., 1996]. There are national, multi-national, and international fuzzy<br />

systems professional societies around the globe whose purposes are <strong>to</strong> foster research, development and<br />

application of fuzzy set theory and fuzzy logic. <strong>Fuzzy</strong> systems are one of the core pillars of the IEEE<br />

Computational Intelligence Society.<br />

An introduction <strong>to</strong> the key components of fuzzy set theory and fuzzy logic <strong>is</strong> now given with the view<br />

<strong>to</strong>ward computational methods of use in bioinformatics. After a d<strong>is</strong>cussion of the general principles of<br />

fuzzy set theory, membership functions and fuzzy connective opera<strong>to</strong>rs, we focus on those areas for<br />

which we present specific applications within bioinformatics: fuzzy logic rule based systems, fuzzy<br />

clustering, fuzzy classifiers, particularly, the <strong>Fuzzy</strong> K-Nearest Neighbor algorithm, fuzzy measures and<br />

the fuzzy integral. The reader <strong>is</strong> referred <strong>to</strong> [Klir and Yuan, 1995; Bezdek et al., 1999] for more<br />

extensive development of the theory and selected applications.


X.4 <strong>Fuzzy</strong> membership Functions and Opera<strong>to</strong>rs<br />

X.4.1 Membership functions<br />

Traditional set theory <strong>is</strong> based on binary, or two-valued, logic. Given a “universe” set X, a subset A of X<br />

can be defined in several ways. Suppose that X <strong>is</strong> the set of integers. The subset of prime numbers less<br />

than 10 can be specified by l<strong>is</strong>ting its members:<br />

A = { 2,3,5,7}<br />

or by providing defining properties<br />

{ x ∈X x <strong>is</strong> a positive integer less than 10 and x has only 2 d<strong>is</strong>tinct div<strong>is</strong>ors :1and x }<br />

A = .<br />

Alternately, we define a subset A by its character<strong>is</strong>tic function, which <strong>is</strong> also denoted by the set name,<br />

A : X → {0,1} from X in<strong>to</strong> the binary set {0,1} given by<br />

A (x)<br />

⎧1<br />

= ⎨<br />

⎩0<br />

if x ∈A<br />

if x ∉A.<br />

Zadeh [1965] simply defined a fuzzy subset of X as a function A : X → [0,1]<br />

, i.e., a character<strong>is</strong>tic function<br />

from X in<strong>to</strong> the interval [0,1]. The value A(x) <strong>is</strong> called the membership of the point x in the fuzzy set A<br />

or the degree <strong>to</strong> which the point x belongs <strong>to</strong> the set A. For example, the fuzzy subset of “big positive<br />

integers” could be defined by<br />

A (x)<br />

⎧ 1<br />

⎪1<br />

−<br />

= ⎨ x<br />

⎪<br />

⎩ 0<br />

if x > 0<br />

else.<br />

An example closer <strong>to</strong> the <strong>to</strong>pic of th<strong>is</strong> book <strong>is</strong> a membership function, A, that describes the similarity of a<br />

protein sequence S <strong>to</strong> that of a specific sequence called S’ in a protein database. A typical measure for<br />

sequence similarity <strong>is</strong> the expectation value, Eval, returned from BLAST [Altschul et al., 1997]. Assume<br />

the score for the alignment between S and S’ under a scoring scheme <strong>is</strong> R. Eval indicates the number of<br />

different alignments with scores equivalent <strong>to</strong> or better than R that are expected <strong>to</strong> occur in a database<br />

search by chance. Although expectation value <strong>is</strong> a good descrip<strong>to</strong>r for sequence search from a database, it<br />

does not reflect all the information of the protein similarity. For example, when two sequences are<br />

identical, the expectation value varies accordingly <strong>to</strong> the length, although biologically the two proteins are<br />

the same. Two different long sequences can have better expectation value than two identical short<br />

sequences. To address th<strong>is</strong> <strong>is</strong>sue, we can describe the similarity between any protein sequence and one


target protein sequence from biological point of view using a fuzzy membership function as an alternative<br />

measure, e.g.,<br />

A (S)<br />

⎧ 0 if Eval > 0.1<br />

= ⎨<br />

⎩sim(S,S'<br />

) else<br />

where sim(S,S’) <strong>is</strong> the sequence identity of the alignment, i.e., the percentage of the identical amino acids<br />

in the alignment.<br />

0.8<br />

0.6<br />

0.4<br />

0.2<br />

1 LOW MEDIUM HIGH<br />

0<br />

0 0.2 0.4 0.6 0.8 1<br />

LOW MEDIUM HIGH<br />

1<br />

0.8<br />

0.6<br />

0.4<br />

0.2<br />

0<br />

0 0.2 0.4 0.6 0.8 1<br />

(a)<br />

(b)<br />

LOW MEDIUM HIGH<br />

1<br />

0.8<br />

0.6<br />

0.4<br />

0.2<br />

0<br />

0 0.2 0.4 0.6 0.8 1<br />

(c)<br />

Figure X.1 Examples of common fuzzy membership functions where X=[0,1]. (a) triangular, (b)<br />

trapezoidal, and (c) smooth quadratic functions. See Example X.xx and problems X.x and X.xx


(a)<br />

(b)<br />

Figure X.2. Practical membership function genera<strong>to</strong>rs from a nursing application in having children<br />

assess their level of pain for medication dosage. In (a), the child slides a bar <strong>to</strong> the place on the v<strong>is</strong>ual<br />

scale that best represents her level of pain. The nurse can read off the analog “pain membership” value<br />

from the reverse side. (b) Th<strong>is</strong> <strong>is</strong> for younger children and demonstrates a similar, though d<strong>is</strong>crete,<br />

version of pain membership. These data are the property of McNeil Consumer Healthcare and McNeil<br />

maintains sole publication and d<strong>is</strong>semination rights.<br />

All fuzzy set theory <strong>is</strong> based on the concept of a membership function. Where do these membership<br />

functions come from? In many cases, they are defined as in the two examples above – common sense<br />

definitions that convey some lingu<strong>is</strong>tic expression. More generally, they come from expert knowledge<br />

directly or they can be derived from questionnaires, heur<strong>is</strong>tics, etc. Th<strong>is</strong> <strong>is</strong> a human-centric view and <strong>is</strong><br />

certainly open <strong>to</strong> debate. In many cases, the membership functions take on specific functional forms like<br />

triangular, trapezoidal, S-functions, pi-functions, sigmoids, and even Gaussians for convenience in<br />

representation and computation. Pi-functions and S-functions are constructed from quadratic functions<br />

“pieced <strong>to</strong>gether” <strong>to</strong> make smooth curves. Figure X.x d<strong>is</strong>plays several common fuzzy membership<br />

functions. Alternately, membership functions (or the parameters of the specific equation forms) can be<br />

learned from training data, much as probability density functions are learned. Some fuzzy clustering<br />

algorithms naturally produce membership functions as their output. A neural network, given proper<br />

input/output training data, also acts as a membership function for new input.<br />

One of our favorite practical membership functions comes from the field of pediatric nursing. A child <strong>is</strong><br />

asked <strong>to</strong> slide the bar <strong>to</strong> a position indicative of h<strong>is</strong> or her level of pain in Figure 2.3(a). The color and


width provide a guide. The nurse turns the instrument over <strong>to</strong> recover what <strong>is</strong> effectively a membership<br />

(after dividing by 10) in the fuzzy set “Pain”. The goal <strong>is</strong> <strong>to</strong> provide sufficient pain medication without<br />

over dosing. Th<strong>is</strong> <strong>is</strong> a continuous membership. For very young children, see the d<strong>is</strong>crete memberships as<br />

shown in Figure 2.3(b).<br />

Example x.x Definitions of membership functions:<br />

It’s pretty obvious how <strong>to</strong> define triangular and trapezoidal fuzzy membership functions over a real<br />

valued domain, i.e., an interval subset of R, (piecew<strong>is</strong>e linear functions – right?). See problem X.xx. S-<br />

functions are defined by 3 parameters (a,b,c) where a < b < c. The function S(x; a,b,c) <strong>is</strong> required <strong>to</strong> be 0<br />

up <strong>to</strong> a; a parabola that opens up from a <strong>to</strong> b where S(b; a,b,c) = ½; a parabola that “matches up” and<br />

opens downward from b <strong>to</strong> c with S(c; a,b,c) = 1. The equation for such an S-function <strong>is</strong><br />

⎧ 0<br />

⎪ (x − a)<br />

2<br />

⎪<br />

⎪ 2(b − a)<br />

2<br />

S(x;a,b,c) = ⎨<br />

⎪−<br />

(x − c)<br />

2<br />

+ 1<br />

⎪<br />

⎪<br />

2(b − c)<br />

2<br />

⎩ 1<br />

x ≤ a<br />

a < x ≤ b<br />

b < x ≤ c<br />

x > c<br />

Hmmm, how do we get that equation? Do problem X.xx. A Z-function <strong>is</strong> just the “flip” of the S-function,<br />

i.e., Z(x;a,b,c) = 1 – S(x;a,b,c). A Pi function just pieces an S-function with a Z-function <strong>to</strong> look<br />

something like a gaussian, though it actually reaches 0 and <strong>is</strong> made from parabolic sections. It has 6<br />

parameters a < b < c ≤ d < e < f and <strong>is</strong> defined by<br />

Pi(x; a,b,c,d,e,f)<br />

⎧S(x;a,b,c)<br />

⎪<br />

= ⎨ 1<br />

⎪<br />

⎩Z(x;d,e,f )<br />

x ≤ c<br />

c < x < d<br />

x ≥ d<br />

Note that if c < d, a Pi function resembles a “soft” trapezoid. Of course, we can (and do) use properly<br />

scaled Gaussian functions as membership functions in many applications. They have the advantage of<br />

having derivatives of all orders, but they never actually reach 0.<br />

Notice that all of these membership functions have at least one value where the function value <strong>is</strong> 1. A<br />

fuzzy set A whose membership function A(x) “reaches” 1 <strong>is</strong> called normal. To be prec<strong>is</strong>e, define the<br />

height of A by ht(A) = sup {A(x)}<br />

. We use supremum (sup) instead of maximum <strong>to</strong> handle certain<br />

x∈X<br />

1<br />

infinite domain cases, like the log<strong>is</strong>tic membership function A(x) = 1 + e<br />

−x<br />

where the domain X <strong>is</strong> the


set of all real numbers, R. A(x) never actually reaches 1 but <strong>is</strong> asymp<strong>to</strong>tic <strong>to</strong> 1. It’s height <strong>is</strong> 1. If Ht(A)<br />

< 1, we say A <strong>is</strong> subnormal.<br />

X.4.2 Basic fuzzy set opera<strong>to</strong>rs<br />

Once fuzzy subsets of a <strong>universal</strong> set X are defined, definitions for the complement of a set, the union of<br />

two sets and the intersection of two sets are required <strong>to</strong> actually generate a “set theory”. In 1965, Zadeh<br />

proposed the following. Suppose A : X → [0,1]<br />

<strong>is</strong> a fuzzy subset of X. The complement A<br />

c<br />

of A <strong>is</strong><br />

defined by<br />

A c (x)<br />

= 1 − A(x) .<br />

Additionally, if B : X → [0,1]<br />

<strong>is</strong> another fuzzy subset of X, Zadeh defined<br />

and<br />

( A ∪ B)(x) = max{A(x),B(x)} = A(x) ∨ B(x)<br />

( A ∩ B)(x) = min{A(x),B(x)} = A(x) ∧ B(x) .<br />

Why did Zadeh define the opera<strong>to</strong>rs in th<strong>is</strong> manner? Quite simply, it was because these definitions revert<br />

back <strong>to</strong> the standard cr<strong>is</strong>p definitions if the subsets are cr<strong>is</strong>p. Hence, th<strong>is</strong> forms a true extension of normal<br />

set theory. As can be found in the many textbooks on fuzzy set theory (see [Klir and Yuan, 1995;<br />

Pedrycz and Gomide, 1998] for example), all of the theorems of cr<strong>is</strong>p set theory hold for th<strong>is</strong> fuzzy set<br />

theory except two: the Law of Contradiction (LOC) and the Law of the Excluded Middle (LEM). The<br />

LOC states that the intersection of a set and its complement must be empty ( A ∩ A<br />

c = φ ), while the LEM<br />

requires that the union of a set and its complement must be the whole universe set ( A ∪ A<br />

c = X ). Since<br />

cr<strong>is</strong>p set theory <strong>is</strong> formally equivalent <strong>to</strong> the first order predicate logic, these two laws state that a<br />

proposition cannot be both true and not true simultaneously, and that either a proposition or its negation<br />

(complement) must be true. While these statements seem reasonable, they give r<strong>is</strong>e <strong>to</strong> a paradox within<br />

classical logic, commonly called Russell’s paradox. A simple version goes something like th<strong>is</strong>: Russell’s<br />

barber has a sign that states “I shave everyone, and only those, who do not shave themselves”. Then who<br />

shaves the barber? If he shaves himself, then he cannot (he shaves only those who do not shave<br />

themselves); but if he does not shave himself, then he must (since he shaves everyone who don’t shave<br />

themselves). Such a dilemma! As mentioned earlier, the Crepe Myrtle <strong>is</strong> both a tree and a non-tree (it <strong>is</strong><br />

also a bush). Hence, it <strong>is</strong> impossible <strong>to</strong> put it only in<strong>to</strong> one set; it also naturally fit in<strong>to</strong> the complement<br />

set. So, perhaps it <strong>is</strong> not that unreasonable <strong>to</strong> d<strong>is</strong>obey the LOC and the LEM (Can you demonstrate th<strong>is</strong>?<br />

problem x.xx).


Example x.xx.<br />

A Simple law: show that standard fuzzy set theory sat<strong>is</strong>fies the Commutative Law for union, i.e.,<br />

( A ∪ B) = (B ∪ A) . Th<strong>is</strong> will follow from what we know about real numbers. We have <strong>to</strong> show that for<br />

each<br />

x ∈ X , ( A ∪ B)(x) = (B ∪ A)(x)<br />

. Just “follow your nose”:<br />

( A ∪ B)(x) = A(x) ∨ B(x) = B(x) ∨ A(x) = (B ∪ A)(x) . The commutative law for union <strong>is</strong> a result of the<br />

fact that the max of two real numbers <strong>is</strong> commutative.<br />

A little more complicated law: Show that standard fuzzy set theory sat<strong>is</strong>fies the d<strong>is</strong>tributive law:<br />

( A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) .<br />

To verify th<strong>is</strong> property, we have <strong>to</strong> show that for each<br />

x ∈ X , ( A ∪ (B ∩ C)(x) = ((A ∪ B) ∩ (A ∪ C))(x)<br />

.<br />

First consider the left hand side (LHS) of the conjectured equality.<br />

Now, ( A ∪ (B ∩ C)(x) = A(x) ∨ (B ∩ C)(x) = A(x) ∨ (B(x) ∧ C(x)).<br />

Similarly, the right hand side (RHS)<br />

becomes (( A ∪ B) ∩ (A ∪ C))(x) = (A ∪ B)(x) ∧ (A ∪ C)(x) = (A(x) ∨ B(x)) ∧ (A(x) ∨ C(x))<br />

. The proof<br />

<strong>is</strong> establ<strong>is</strong>hed by looking at all the possible configurations for the 3 values A (x),B(x),and C(x)<br />

:<br />

Case 1. A(x)<br />

≤ B(x) ≤ C(x)<br />

. Here, the LHS becomes A (x) ∨ B(x) = B(x)<br />

and the RHS likew<strong>is</strong>e <strong>is</strong><br />

( A(x) ∨ B(x)) ∧ (A(x) ∨ C(x)) = B(x) ∧ C(x) = B(x) , i.e., for Case 1, LHS = RHS. √<br />

Case 2. A(x)<br />

≤ C(x) ≤ B(x)<br />

. You get the idea, right?<br />

For all configurations, you verify that the operations on the LHS and RHS give the same result. How<br />

many configurations are there for 3 real numbers? (see problem X.xx). It’s tedious, but straightforward <strong>to</strong><br />

verify th<strong>is</strong> law for fuzzy set theory. The proof <strong>is</strong> like the truth table proof done for the cr<strong>is</strong>p version of the<br />

law in a digital logic class (except there, the 3 variables can only take on values of 0 and 1).<br />

Suppose that the membership function values for a particular element x in X are interpreted as the<br />

confidences that x possesses certain properties, e.g., A (x)<br />

<strong>is</strong> the confidence that an image region x <strong>is</strong><br />

LONG and B (x)<br />

corresponds <strong>to</strong> the confidence that x <strong>is</strong> STRAIGHT. Then the original Zadeh<br />

definitions of complement, union, and intersection produce confidences related <strong>to</strong> the lingu<strong>is</strong>tic concepts<br />

of NOT, OR, and AND: A c (x)<br />

<strong>is</strong> the confidence that x <strong>is</strong> NOT LONG; ( A ∪ B)(x)<br />

gives the degree <strong>to</strong><br />

which x <strong>is</strong> either LONG or STRAIGHT (or both); ( A ∩ B)(x)<br />

computes the confidence that x <strong>is</strong> both


LONG and STRAIGHT. Beta-catenin would have a non-zero membership in the intersection of A and<br />

A c . MAYBE USE SILHOUETTES or VOXEL PERSON<br />

The good news and the bad news in fuzzy set theory <strong>is</strong> that there are infinite numbers of ways <strong>to</strong> define<br />

complement, union and intersection [Klir and Yuan, 1995; Dubo<strong>is</strong> and Prade, 1985]. An alternate fuzzy<br />

set theory that <strong>is</strong> useful for fuzzy logic inference <strong>is</strong> generated by the opera<strong>to</strong>rs<br />

( A ∪ b B)(x) = 1 ∧ (A(x) + B(x)) ,<br />

( A ∩ b B)(x) = 0 ∨ (1−<br />

(A(x) + B(x))) ,<br />

called the bounded sum and bounded difference, along with the standard complement,<br />

A c (x)<br />

= 1 − A(x) .<br />

Each such extension of cr<strong>is</strong>p set theory loses either LOC and LEM or two other properties (idempotency<br />

and d<strong>is</strong>tributivity) [Klir and Yuan, 1995]. For th<strong>is</strong> choice, LOC and LEM are sat<strong>is</strong>fied, while<br />

idempotency and d<strong>is</strong>tributivity are lost. Actually, there are infinite families of union, intersection and<br />

complement opera<strong>to</strong>rs that are extremely useful in multicriteria dec<strong>is</strong>ion making where partially<br />

supported criteria are <strong>to</strong> be combined in d<strong>is</strong>junctive (OR) and/or conjunctive (AND) manners <strong>to</strong> reach an<br />

overall evaluation of an alternative.<br />

One such infinite family of connectives <strong>is</strong> due <strong>to</strong> Yager [Yager, 1980]. Here, complement, union and<br />

intersection are given by<br />

A<br />

c<br />

(x) = (1−<br />

A(x)<br />

w<br />

)<br />

1/w<br />

, w ∈(0,<br />

∞)<br />

, (X.1a)<br />

1<br />

( A ∪ B)(x) min{1,(A(x)<br />

w<br />

B(x)<br />

w<br />

) w<br />

w =<br />

+ }, w ∈(0,<br />

∞),<br />

and (X.1b)<br />

1<br />

( A ∩ B)(x) 1 min{1,((1 A(x))<br />

w<br />

(1 B(x))<br />

w<br />

) w<br />

w = − − + − }, . (X.1c)<br />

w ∈(0,<br />

∞).<br />

For all choices of w, the value of the Yager union opera<strong>to</strong>r <strong>is</strong> greater than the standard union (max), while<br />

that for the intersection <strong>is</strong> less than the standard intersection (min). In other words, a Yager union<br />

opera<strong>to</strong>r <strong>is</strong> more optim<strong>is</strong>tic than the maximum (in combining confidence), whereas each Yager<br />

intersection produces values that are more pessim<strong>is</strong>tic than the minimum. The parameter w controls the<br />

degree of optim<strong>is</strong>m or pessim<strong>is</strong>m. In fact, the following limits hold:<br />

lim (A ∪w<br />

B)(x) = A(x) ∨ B(x) and<br />

w→0


lim (A ∩w<br />

B)(x) = A(x) ∧ B(x) .<br />

w→0<br />

At the other end, i.e., the limits as w<br />

→<br />

∞ , generate the drastic union and intersection, defined by<br />

⎧A(x)<br />

if B(x) = 0<br />

⎪<br />

( A ∪d B)(x) = ⎨B(x)<br />

if A(x) = 0 and<br />

⎪<br />

⎩ 1 else<br />

⎧A(x)<br />

if B(x) = 1<br />

⎪<br />

( A ∩d B)(x) = ⎨B(x)<br />

if A(x) = 1.<br />

⎪<br />

⎩ 0 else<br />

Besides their use in what <strong>is</strong> <strong>to</strong> come, fuzzy opera<strong>to</strong>rs have been used extensively in multicriteria dec<strong>is</strong>ion<br />

making [Bellman and Zadeh, 1970; Yager, 1988; Yager, 2004]. We end th<strong>is</strong> section with an example of<br />

the use of fuzzy set opera<strong>to</strong>rs in multicriteria dec<strong>is</strong>ion making.<br />

Example X.1. As a particularly simpl<strong>is</strong>tic illustration, consider a dec<strong>is</strong>ion tree, as in Figure X.3, <strong>to</strong> assess<br />

cancer r<strong>is</strong>k based on the following observations. Suppose we decide that cancer r<strong>is</strong>k should be high if<br />

either internal fac<strong>to</strong>rs or environmental fac<strong>to</strong>rs are high. Th<strong>is</strong> <strong>is</strong> modeled by a union opera<strong>to</strong>r (OR). We<br />

define the internal fac<strong>to</strong>rs <strong>to</strong> be the conjunction (AND) of genetic pred<strong>is</strong>position and genetic test results.<br />

In th<strong>is</strong> particular example, the rationale for using a conjunction might be that for some types of cancer, the<br />

test might be subject <strong>to</strong> a high rate of false positives, and so, these results can be offset by low genetic<br />

propensity. Environmental fac<strong>to</strong>rs are aggregated as the d<strong>is</strong>junction (OR) of amount of smoking,<br />

hazardous work r<strong>is</strong>k and the negation, or complement (NOT), of good nutrition. Clearly, th<strong>is</strong> <strong>is</strong> a gross<br />

oversimplification and <strong>is</strong> included <strong>to</strong> demonstrate the utility of fuzzy opera<strong>to</strong>rs in multicriteria dec<strong>is</strong>ion<br />

making more than focusing on reality. In Figure X.3, we model each of the opera<strong>to</strong>rs with the<br />

corresponding Yager connective (Equation X.1). The parameters for these four connectives will be<br />

labeled w 1 for the <strong>to</strong>p d<strong>is</strong>junction, w 2 for the conjunction of internal fac<strong>to</strong>rs, w 3 for the d<strong>is</strong>junction of<br />

external fac<strong>to</strong>rs and w 4 for the complement.<br />

Given the tree in Figure 2.4, suppose that we have determined the following fuzzy membership values for<br />

the leaf nodes: propensity = 0.2, test results = 0.8, smoking r<strong>is</strong>k = 0, job r<strong>is</strong>k = 0.1, and good nutrition =<br />

0.8. That <strong>is</strong>, a particular patient has a low genetic propensity but a reasonably high likelihood from a test,<br />

while living a good lifestyle. If the logical opera<strong>to</strong>rs are the classical binary ones, then the r<strong>is</strong>k of cancer<br />

would be zero for th<strong>is</strong> set, assuming that the fuzzy memberships are turned binary at a 0.5 threshold. The<br />

advantage of fuzzy set theory <strong>is</strong> that the opera<strong>to</strong>rs that govern complement, d<strong>is</strong>junction and conjunction


can be tailored <strong>to</strong> reflect different user d<strong>is</strong>positions. Table X.1 d<strong>is</strong>plays the “cancer r<strong>is</strong>k” output for a few<br />

choices of connection parameters. For example, using the parameters on line 3 of Table 2.1, we calculate<br />

the R<strong>is</strong>k as:<br />

R<strong>is</strong>k = min{1,((1 − min{1,((1 − 0.2)<br />

0.5<br />

+ (1 − 0.8)<br />

0.5<br />

)<br />

2<br />

})<br />

0.5<br />

+<br />

(min{1,((min{1,(0<br />

1<br />

+ 0.1<br />

1<br />

)<br />

1/1<br />

})<br />

1<br />

+ ((1 − 0.8<br />

2<br />

)<br />

0.5<br />

)<br />

1<br />

)<br />

1/1<br />

})<br />

0.5<br />

)<br />

2<br />

} = 0.7<br />

Figure X.3 Oversimplified tree structure <strong>to</strong> demonstrate the utility of fuzzy opera<strong>to</strong>rs<br />

The weights in case 1 of Table X.1 produce opera<strong>to</strong>rs that behave like the classical binary ones, and hence<br />

produce r<strong>is</strong>k near zero. If the patient or doc<strong>to</strong>r <strong>is</strong> more aggressive relative <strong>to</strong> assessing r<strong>is</strong>k, Table X.1<br />

provides examples that produce low, moderate and even complete r<strong>is</strong>k for those same inputs. While we<br />

claim that th<strong>is</strong> flexibility <strong>is</strong> an advantage of fuzzy set theory, some may argue that it confuses the<br />

situation. The message <strong>is</strong> that no one should use computational or logical operations on data without<br />

understanding how these opera<strong>to</strong>rs combine the data. By studying fuzzy set connectives (as in [Klir and<br />

Yuan, 1995]), different degrees of aggressiveness can be quantified and produce meaningful tradeoffs <strong>to</strong> a<br />

patient in th<strong>is</strong> case, or for more general multicriteria dec<strong>is</strong>ion making processes.<br />

Table X.1 Cancer r<strong>is</strong>k output for various weights for the input values stated in the text<br />

Parameters w 1 w 2 w 3 w 4 R<strong>is</strong>k<br />

1 0.5 0.5 2.0 0.5 0.1<br />

2 1.0 10.0 1.0 0.5 0.3<br />

3 0.5 0.5 1.0 2.0 0.7


4 0.5 0.5 0.5 2.0 1.0<br />

Table X.2 R<strong>is</strong>k output for various smoking habits for w 1 =1.0, w 2 =10.0, w 3 =1.0, w 4 =0.5 and for the input<br />

values stated in the text.<br />

Smoking 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0<br />

R<strong>is</strong>k 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.0 1.0 1.0<br />

Additionally, for a given choice of parameters, a “what-if” game can be played. In the above example,<br />

with the parameters as in case 2 of Table X.1, we can examine the change in cancer r<strong>is</strong>k given by<br />

changing a patient’s smoking habits, as d<strong>is</strong>played in Table X.2.<br />

x.xx Alpha-cuts, the Decomposition Theorem, and the Extension Principle<br />

Suppose you want <strong>to</strong> take the average, or weighted average of a bunch of senor measurements that are<br />

uncertain. We might model the uncertainty by fuzzy numbers, i.e., by normal, convex fuzzy sets over the<br />

real line. You know what normal fuzzy sets are; intuitively convex fuzzy subsets of the real numbers<br />

have membership function that “go up” for awhile and then “go down”, like triangles, trapezoids, Pifunctions,<br />

and Gaussians, but nothing bi-modal for example. We’ll make th<strong>is</strong> more prec<strong>is</strong>e shortly. So<br />

how do we do arithmetic with fuzzy numbers? The s<strong>to</strong>ry goes like th<strong>is</strong>.<br />

Let A be a fuzzy subset of X. For each ∈(0,1]<br />

A . The cr<strong>is</strong>p set<br />

α<br />

A <strong>is</strong><br />

α , define<br />

α<br />

= { x ∈X A(x) ≥ α}<br />

called the α-cut of A. The set 1 A , i.e., the set of x such that A(x) = 1, <strong>is</strong> called the core of A. Of course,<br />

0<br />

A<br />

A , then<br />

0+<br />

A <strong>is</strong> the set<br />

<strong>is</strong> all of X. If we define the strong α-level set of A by<br />

α+<br />

= { x ∈X A(x) > α}<br />

of all x such that A(x) > 0, known as the support of A. Now we can formalize in the simple case what we<br />

mean by convex fuzzy subsets of the reals, R. They are those fuzzy subsets all of whose α-cuts are cr<strong>is</strong>p<br />

convex subsets of the real numbers (intervals for one dimension). See [xxxKlir] for a more general<br />

definition of a convex fuzzy set along with the theorem connecting that definition <strong>to</strong> α-cuts.


For each α ∈(0,1]<br />

, define a fuzzy set<br />

⎪⎧<br />

α x∈<br />

α<br />

A<br />

( αA)(x)<br />

= ⎨ . Note that for all α except 1, th<strong>is</strong> <strong>is</strong> a<br />

⎪⎩ 0 x∉<br />

α<br />

A<br />

subnormal fuzzy subset.<br />

Why go through all th<strong>is</strong> work? We can decompose a fuzzy set in<strong>to</strong> the “union” of its α-cuts [Klirxxx].<br />

Decomposition Theorem: Let A be a fuzzy subset of X. Then<br />

A<br />

<br />

= where A (x) sup { A(x) }<br />

α A<br />

α∈[0,1]<br />

⎛ ⎞<br />

⎜ ⎟<br />

⎜ α ⎟ = α .<br />

α∈[0,1]<br />

⎝ ⎠<br />

α∈[0,1]<br />

Th<strong>is</strong> theorem (actually pretty straightforward <strong>to</strong> prove) states that if you know the α-cuts of A, then you<br />

know A itself. Th<strong>is</strong> <strong>is</strong> one of the two building blocks <strong>to</strong> perform fuzzy math. The other <strong>is</strong> the extension<br />

principle [Zadeh xxx]. Simply put, given a function f : X → Y between two domains, we can “extend” f<br />

<strong>to</strong> be a function between fuzzy subsets of X and fuzzy subset of Y. Note that you already know how <strong>to</strong><br />

extend f <strong>to</strong> a function on the cr<strong>is</strong>p subsets of X by<br />

f (A)<br />

{ y∈Y<br />

y = f (x) for some x ∈A}<br />

= . The image of a cr<strong>is</strong>p subset A of X <strong>is</strong> again a cr<strong>is</strong>p subset f(A) of<br />

Y. There might be many values of x that map <strong>to</strong> a given y (think of f(x) = x 2 and y = 1), but all you need<br />

for cr<strong>is</strong>p subsets <strong>is</strong> 1such correspondence. Hence, for cr<strong>is</strong>p subsets<br />

⎧1<br />

if y = f (x) for some x ∈A<br />

f (A)(y) = ⎨<br />

. If A <strong>is</strong> a fuzzy subset of X, then the Extension Principle<br />

⎩0<br />

else<br />

states that for a fuzzy subset A of X,<br />

{ A(x) y f (x)}<br />

f (A)(y) = sup = , i.e., you find the “largest” membership in the set of elements of X that<br />

map <strong>to</strong> y. If the domain of f <strong>is</strong> greater than 1, say f:<br />

A 1 , …, A n in their respective domains, the extended function becomes<br />

f (A)(y)<br />

XX).<br />

{ A (x ) ∧ A (x ) ∧ ∧ A (x ) y = f (x ,x ,x )}<br />

X 1 × X2<br />

× ×<br />

Xn<br />

→ Y, and you have fuzzy subsets<br />

= sup 1 1 2 2 n n 1 2, n (we’ll see more of th<strong>is</strong> in chapter<br />

Now we combine these two ingredients <strong>to</strong> get fuzzy number math. While the approach below can extend<br />

<strong>to</strong> many kinds of functions (see the problems), we restrict <strong>to</strong> those needed in computing the weighted<br />

average of fuzzy numbers where the weights themselves are fuzzy numbers, i.e., addition, subtraction,<br />

multiplication and div<strong>is</strong>ion. To perform one of these binary operations on a pair of fuzzy numbers, the<br />

function under consideration looks like #: R×R → R where # <strong>is</strong> one of {+, -, ×, /}. Suppose A and B are<br />

two fuzzy numbers. By the Extension Principle, we could directly compute


{ A(x ) ∧ B(x ) y x # }<br />

( A#B)(y) = sup 1 2 = 1 x2<br />

. Th<strong>is</strong> <strong>is</strong> tedious at best, and in the continuous case involves<br />

solving a nonlinear program for each value of y. However, for fuzzy numbers and the basic arithmetic<br />

opera<strong>to</strong>rs, we have that<br />

α<br />

(A#B) = (<br />

α<br />

A) # (<br />

α<br />

B) . Since the α-cuts of a fuzzy number are closed intervals, computing the a-cut of<br />

the extended arithmetic operation reduces <strong>to</strong> interval arithmetic, something that <strong>is</strong> easy <strong>to</strong> do. Then using<br />

the decomposition theorem, we fin<strong>is</strong>h th<strong>is</strong> off by noting that<br />

⎛<br />

⎞<br />

⎜<br />

⎟<br />

A#B= ⎜ α ( A#B) ⎟<br />

⎝α∈[0,1]<br />

⎠<br />

⎧<br />

⎪<br />

if x - 4 or x 3<br />

x 0<br />

< ><br />

⎪ + 4<br />

⎨<br />

if - 4 x 0<br />

Example X.xx: Let A(x) = - x<br />

4 ≤ ≤<br />

⎪<br />

3 if 0 < x ≤ 3<br />

⎪ +<br />

⎩ 3<br />

and<br />

B(x) =<br />

⎧ 0<br />

⎪<br />

⎨x<br />

+ 2<br />

⎪<br />

⎩ - x<br />

if x < - 2 or x > 0<br />

if - 2<br />

≤ x ≤ -1<br />

if -1<<br />

x ≤ 0<br />

Find 0.5 A and 0.5 B. Compute 0.5 A + 0.5 B = 0.5 (A+B)<br />

From the Extension Principle, give a detailed explanation of how the single value of MAX(A,B)(-3/2) <strong>is</strong><br />

calculated.<br />

NEED PICTURE and EXAMPLE HERE<br />

R<br />

X.4.3 Compensa<strong>to</strong>ry opera<strong>to</strong>rs


Of course, the meaningfulness of the results of an analys<strong>is</strong> as in example X.1 depends on the faithfulness<br />

of the model and the accuracy of assessing the input values. Th<strong>is</strong> problem <strong>is</strong> not specific <strong>to</strong> fuzzy set<br />

theory, but <strong>is</strong> inherent <strong>to</strong> all computational paradigms. The d<strong>is</strong>cussion here makes no overt claims <strong>to</strong><br />

accurately model cancer r<strong>is</strong>k, but <strong>is</strong> only used <strong>to</strong> demonstrate the flexibility of fuzzy connectives in<br />

dec<strong>is</strong>ion processes.<br />

In Figure X.3, we might conjecture that the internal and external fac<strong>to</strong>rs might better be combined in a<br />

compensative manner, i.e., more like an average than a union. Besides modeling negation (NOT),<br />

d<strong>is</strong>junction (OR) and conjunction (AND), fuzzy set theory admits mechan<strong>is</strong>ms <strong>to</strong> model compensa<strong>to</strong>ry<br />

connections, i.e., aggregation opera<strong>to</strong>rs where a high value in matching one criterion can compensate <strong>to</strong><br />

some extent for a low value for another criterion. The simplest of these <strong>is</strong> called the generalized mean. If<br />

a1,a2,...,<br />

an<br />

are the degrees of sat<strong>is</strong>faction of n criteria, the generalized mean <strong>is</strong> defined as:<br />

1/ α<br />

⎛ α α α ⎞<br />

⎜ a + + + ⎟<br />

α<br />

= 1<br />

a<br />

2<br />

an<br />

h (a1,a2,<br />

,an<br />

)<br />

⎜ n ⎟<br />

(X.2)<br />

⎝<br />

⎠<br />

where α <strong>is</strong> a fixed real number. For α = 1, th<strong>is</strong> equation implements the arithmetic average, for α = -1, we<br />

have the harmonic average, and for α converging <strong>to</strong> 0, Equation X.2 produces the geometric mean, the nth<br />

root of the product of the values. All instantiations of the generalized means produce values between the<br />

minimum and maximum of the degrees of sat<strong>is</strong>faction of the individual criteria. Additionally,<br />

{ a , ,<br />

}<br />

lim hα<br />

(a1,<br />

,an<br />

) = min 1 an<br />

α→−∞<br />

{ a , ,<br />

}<br />

lim hα<br />

(a1,<br />

,an<br />

) = max 1 an<br />

α→∞<br />

.<br />

Th<strong>is</strong> high degree of coverage makes fuzzy set connectives appealing for multicriteria dec<strong>is</strong>ion making.<br />

In [Kr<strong>is</strong>hnapuram and Lee, 1992a, 1992b], Yager unions and intersections, along with generalized means,<br />

were used in hierarchical dec<strong>is</strong>ion networks, and a gradient descent-based training algorithm was created<br />

<strong>to</strong> learn the parameters of the connectives in the network from a set of input/output training data.<br />

However, there were fairly cumbersome tests <strong>to</strong> decide if a node should be a union, intersection or mean<br />

(and <strong>to</strong> flip between them). A more general class of connectives, called fuzzy hybrid opera<strong>to</strong>rs, combine<br />

all three types of lingu<strong>is</strong>tic connectives in<strong>to</strong> a single equation. The typical arithmetic and multiplicative<br />

hybrid opera<strong>to</strong>rs are given by:<br />

A ⊕ B = (1 − γ)(A<br />

∩ B)<br />

1−γ<br />

γ + γ(A<br />

∪ B) (X.3)<br />

and


A ⊗<br />

−γ γ<br />

γ B = (A ∩ B)<br />

1<br />

⋅(A<br />

∪ B)<br />

(X.4)<br />

where γ <strong>is</strong> between 0 and 1 and controls the amount of “mixing” of the union and intersection<br />

components, i.e., if γ <strong>is</strong> close <strong>to</strong> 0, the hybrids acts like an intersection, near 1 produces a union-like<br />

response, and for γ around 0.5, the hybrid takes on the character<strong>is</strong>tics of a generalized mean.<br />

Zimmermann and Zysno [1980] proposed a hybrid opera<strong>to</strong>r for multicriteria aggregation that was<br />

modeled after the compensa<strong>to</strong>ry nature of human aggregation. Th<strong>is</strong> hybrid opera<strong>to</strong>r (γ model) <strong>is</strong> an<br />

example of Equation (X.4) and <strong>is</strong> given by:<br />

where,<br />

n<br />

and∑ δ i = n<br />

i=<br />

1<br />

aggregated.<br />

n<br />

n<br />

δ −γ<br />

δ<br />

Y = (<br />

γ<br />

∏(a<br />

) i )<br />

1<br />

(1 −∏(1<br />

− a ) i<br />

i<br />

i ) (X.5)<br />

i=<br />

1<br />

i=<br />

1<br />

a i ∈[0,1] are the criteria sat<strong>is</strong>factions <strong>to</strong> be aggregated, 0 ≤ γ ≤ 1 <strong>is</strong> the mixing coefficient,<br />

. Here, δ i are weights associated with each criterion a i and n <strong>is</strong> the number of criteria being<br />

Y<br />

δ 1<br />

γ<br />

………<br />

Final Node<br />

δ n<br />

Node 1<br />

γ<br />

…….…..……<br />

γ<br />

Node n<br />

δ 1<br />

….<br />

δ n<br />

δ 1<br />

…..<br />

δ n<br />

….<br />

……<br />

a 1 a n a 1 a n<br />

Figure X.4 A fuzzy aggregation network of multiplicative hybrids used in [Parekh and Keller, 2007].<br />

Kr<strong>is</strong>hnapuram and Lee [1992a,b] also developed a back-propagation algorithm <strong>to</strong> learn the parameters of<br />

opera<strong>to</strong>rs of th<strong>is</strong> type in a network-based dec<strong>is</strong>ion application. While the algorithm converged, the


derivatives were quite messy and as with all such algorithms, convergence could only be guaranteed <strong>to</strong> a<br />

local minimum of a least squares fitness function. Keller et al. [1994] extended the approach <strong>to</strong> additive<br />

hybrid networks. In [Parekh and Keller, 2007], particle swarm optimization [Eberhart and Kennedy,<br />

1995; Clerc, 2004] was used <strong>to</strong> train these aggregation networks. Figure X.4 d<strong>is</strong>plays such a network.<br />

The advantage of swarm optimization <strong>is</strong> that many potential solutions (here, the l<strong>is</strong>t of all node<br />

parameters) are randomly generated and through individual particle memory and communication between<br />

particles, large areas of the optimization search space can be covered while still moving quickly <strong>to</strong> a<br />

(usually very good) local optimum of the fitness function. Additionally, in th<strong>is</strong> case, no derivatives were<br />

necessary, since each particle contains all the node parameters and evaluation <strong>is</strong> performed directly at<br />

each time step.<br />

Table X.3 Sample of the 800 training and 200 testing input/output data for learning the parameters of the<br />

nodes in the network of Figure X.4, where each node has 2 inputs<br />

Sample of Training Data<br />

Node 1 Node 2<br />

a1 a2 a1 a2<br />

Y Y' SSE<br />

0.425 0.590 0.655 0.861 0.364 0.363<br />

0.768 0.452 0.629 0.668 0.209 0.211<br />

0.532 0.053 0.521 0.548 0.018 0.018<br />

0.00175<br />

0.235 0.868 0.722 0.892 0.490 0.492<br />

0.673 0.925 0.428 0.829 0.542 0.541<br />

Sample of Test Data<br />

0.467 0.538 0.518 0.990 0.423 0.420<br />

0.771 0.678 0.617 0.999 0.621 0.622<br />

0.810 0.344 0.392 0.109 0.005 0.005<br />

0.00052<br />

0.997 0.644 0.235 0.630 0.242 0.244<br />

0.272 0.032 0.821 0.528 0.009 0.008


Example X.2. As an example, synthetic data was used <strong>to</strong> verify th<strong>is</strong> approach. Parameters for the<br />

multiplicative hybrid opera<strong>to</strong>rs were randomly generated and assigned <strong>to</strong> each node. Then, a table of<br />

1000 input values was randomly generated and corresponding outputs were calculated from successive<br />

applications of Equation X.5. The training data cons<strong>is</strong>ted of 800 data points and the test data had 200 data<br />

points. Table X.3 shows a sample of the training and test data from one experiment while Table X.4<br />

shows the original and recovered hybrid parameters. With an easy and effective training mechan<strong>is</strong>m,<br />

such fuzzy aggregation networks are attractive <strong>to</strong>ols for hierarchical confidence fusion. Besides the<br />

ability <strong>to</strong> approximate input/output training data, an additional advantage of these networks <strong>is</strong> that after<br />

training, each node can be associated with a lingu<strong>is</strong>tic connective (d<strong>is</strong>junction, conjunction, mean), based<br />

on the corresponding value of γ, and the weights give an indication of the importance of the particular<br />

criteria <strong>to</strong>wards the fused result.<br />

Table X.4 Actual and recovered parameters corresponding <strong>to</strong> Table X.3.<br />

Parameter Actual Recovered<br />

Node<br />

1<br />

δ 1 0.440 0.446<br />

δ 2 1.559 1.553<br />

γ 0.255 0.341<br />

Node<br />

2<br />

Final<br />

Node<br />

δ 1 0.161 0.163<br />

δ 2 1.838 1.836<br />

γ 0.180 0.198<br />

δ 1 0.786 0.816<br />

δ 2 1.213 1.183<br />

γ 0.0846 0.028<br />

Exerc<strong>is</strong>es<br />

x. A trapezoid membership function <strong>is</strong> defined by 4 parameters a < ≤ b c < d. (If b = c, you have a<br />

triangular function). Define the equations for trapezoid and triangular fuzzy sets over a fixed interval of<br />

reals, [r, s]. Note that these membership functions do not need <strong>to</strong> be symmetric about the “center”. What<br />

happens if b < r? What about s < c?<br />

Let X = [0, 10]. Define and draw the graphs of the Trap(x; 1,2,3,5), Trap(x; 3,5,7,9), Trap(x; 8,9,9,10),<br />

Trap(x; -2,-1,3,5) and Trap(x; 8,9,11,12).


xx. A. Derive the equation for the S-function of example X.xx. Hint: there are 2 parabola pieces and for<br />

each piece you know 2 points, the value of the function at 2 of the parameters. But wait you say, a<br />

parabola has 3 coefficients, so you need at least 3 equations <strong>to</strong> find them. Think about the derivative.<br />

B. Show that the S-function has a well defined derivative at all points in its domain (even at the “join”<br />

points).<br />

C. Let X = [0, 10]. Define and graph the functions S(x; 5,7,9), Z(x; 2,3,6), Pi(x; 1,2,3,3,4,5), and Pi(x;<br />

4,5,6,7,8,10).<br />

xx. Fin<strong>is</strong>h the proof of the d<strong>is</strong>tributive law in example x.xxx.<br />

xx. Show that DeMorgan’s laws hold for the standard fuzzy set theory definitions, i.e., that<br />

( A ∪ B)<br />

c<br />

= A<br />

c<br />

∩ B<br />

c<br />

and<br />

( A ∩ B)<br />

c<br />

= A<br />

c<br />

∪ B<br />

c<br />

.<br />

xx. Show that LEM and LOC do hold for the standard fuzzy set theory definitions, show that<br />

A ∪ A<br />

c = X and A ∩ A<br />

c = φ are not true in general for fuzzy set theory. Note that X (x) = 1 and<br />

φ( x) = 0 for all x ∈ X . Hint: think about how you show that a statement <strong>is</strong> NOT a theorem.<br />

1. Suppose X = { -3, -2, -1, 0, 1, 2, 3}. Let fuzzy subsets A and B of X be defined by their membership<br />

vec<strong>to</strong>rs A = (0.0, 0.3, 0.8, 1.0, 0.8, 0.3, 0.0) and B = (1.0, 0.9, 0.7, 0.5, 0.2, 0.0, 0.0)<br />

A. Using Zadeh’s original definitions, compute<br />

A c<br />

A ∪ B<br />

A ∩ B<br />

B. What <strong>is</strong> A ∪ B if ( A ∪ B)(x)<br />

= 1 ∧ (A(x) + B(x))?<br />

2. Consider the opera<strong>to</strong>rs:


( A ∪ B)(x) = 1 ∧ (A(x) + B(x))<br />

( A ∩ B)(x) = 0 ∨ (A(x) + B(x) – 1)<br />

A c (x) = 1 – A(x)<br />

A. Show that Intersection and Complement sat<strong>is</strong>fy the Law of Contradiction.<br />

B. Is it true that Intersection <strong>is</strong> idempotent: A ∩ A = A ? (prove or give a counterexample)<br />

3. Let A(x) =<br />

⎧ 0<br />

⎪<br />

⎨x + 3<br />

⎪<br />

⎩-<br />

x -1<br />

if x < - 3 or x > -1<br />

if - 3 ≤ x ≤ - 2<br />

if - 2 < x ≤ -1<br />

and<br />

B(x) =<br />

⎧ 0<br />

⎪<br />

⎨ x -1<br />

⎪<br />

⎩-<br />

x + 3<br />

if x < 1or x > 3<br />

if 1 ≤ x ≤ 2<br />

if 2 < x ≤ 3<br />

A. Compute 0.7 A.<br />

B. Using the standard definitions, sketch a picture of B c .<br />

C. Now, let C(x) =<br />

⎧ 0<br />

⎪<br />

⎨ x - 2<br />

⎪<br />

⎩-<br />

x + 4<br />

if x < 2 or x > 4<br />

if 2 ≤ x ≤ 3<br />

if 3 < x ≤ 4<br />

Compute<br />

B ∩ C<br />

4. Briefly d<strong>is</strong>cuss th<strong>is</strong> statement: There <strong>is</strong> no need for fuzzy set theory because all uncertainty can be<br />

modeled by probability theory. (1 or 2 short paragraphs only)<br />

5. Let A = .5/a + .4/b + .7/c + .8/d + 1/e. L<strong>is</strong>t all non-empty α-cuts of A.


α β<br />

6. Let A be any fuzzy subset of X. Show that A ⊆ A if β < α.<br />

7. A. Does the Yager union and complement (same w) sat<strong>is</strong>fy the law of the excluded middle?<br />

Recall: Yager complement <strong>is</strong> A<br />

c<br />

(x) = (1−<br />

A(x)<br />

w<br />

)<br />

1/w<br />

, w ∈(0,<br />

∞)<br />

. If the answer <strong>is</strong> “yes”, prove it; if the<br />

answer <strong>is</strong> “no”, provide a specific counterexample.<br />

B. Does the Yager intersection and complement (same w) sat<strong>is</strong>fy the law of contradiction? If the<br />

answer <strong>is</strong> “yes”, prove it; if the answer <strong>is</strong> “no”, provide a specific counterexample.

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!