Learning Data Mining with Python

Now, we can compute the probability of the data point belonging to this class. An important point to note is that we haven't computed P(D), so this isn't a real probability. However, it is good enough to compare against the same value for class 1. Let's take a look at the calculation:

P(C=0|D) = P(C=0) x P(D|C=0)
         = 0.75 x 0.0756
         = 0.0567

Now, we compute the same values for class 1:

P(C=1) = 0.25

P(D) isn't needed for naive Bayes. Let's take a look at the calculation:

P(D|C=1) = P(D1|C=1) x P(D2|C=1) x P(D3|C=1) x P(D4|C=1)
         = 0.7 x 0.7 x 0.6 x 0.9
         = 0.2646

P(C=1|D) = P(C=1) x P(D|C=1)
         = 0.25 x 0.2646
         = 0.06615
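The two calculations above can be sketched in a few lines of Python. The likelihood values are taken directly from the worked example (P(D|C=0) = 0.0756 was computed earlier in the chapter); the variable names are just for illustration:

```python
# Priors and likelihoods from the worked example
p_c0 = 0.75                              # P(C=0)
p_c1 = 0.25                              # P(C=1)
likelihood_c0 = 0.0756                   # P(D|C=0), computed earlier
likelihood_c1 = 0.7 * 0.7 * 0.6 * 0.9    # P(D|C=1) = 0.2646

# Unnormalized posterior scores: prior times likelihood
score_c0 = p_c0 * likelihood_c0          # 0.0567
score_c1 = p_c1 * likelihood_c1          # 0.06615

# The predicted class is whichever score is larger
predicted_class = 0 if score_c0 > score_c1 else 1
print(predicted_class)  # 1
```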

Normally, P(C=0|D) + P(C=1|D) should sum to 1. After all, those are the only two possible options! However, the values here don't, because we haven't included the computation of P(D) in our equations.
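If real probabilities are wanted, the missing P(D) is simply the sum of the two unnormalized scores; dividing each score by it makes the results sum to 1. A quick sketch using the values computed above:

```python
# Unnormalized scores from the worked example
score_c0 = 0.0567
score_c1 = 0.06615

# P(D) is the normalizing constant we skipped
p_d = score_c0 + score_c1

p_c0_given_d = score_c0 / p_d   # roughly 0.4615
p_c1_given_d = score_c1 / p_d   # roughly 0.5385

# Now the two posteriors sum to 1, and class 1 still wins
```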

The data point should therefore be classified as belonging to class 1. You may have guessed this while going through the equations anyway; however, you may have been a bit surprised that the final decision was so close. After all, the likelihoods computed for P(D|C) were much, much higher for class 1. This is because we introduced a prior belief that most samples generally belong to class 0.

If the classes had been of equal size, the resulting probabilities would be quite different. Try it yourself by changing both P(C=0) and P(C=1) to 0.5 for equal class sizes and computing the result again.
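The equal-prior experiment suggested above can be run directly; only the priors change, while the likelihoods stay as in the worked example:

```python
# Same likelihoods as before, but equal priors of 0.5 each
likelihood_c0 = 0.0756   # P(D|C=0)
likelihood_c1 = 0.2646   # P(D|C=1)

score_c0 = 0.5 * likelihood_c0   # 0.0378
score_c1 = 0.5 * likelihood_c1   # 0.1323

# With equal priors, class 1 wins by a much larger margin,
# since the decision now depends only on the likelihoods.
```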
