08.06.2015 Views

Building Machine Learning Systems with Python - Richert, Coelho


Chapter 6

Getting to know the Bayes theorem

At its core, Naive Bayes classification is nothing more than keeping track of which feature gives evidence to which class. To ease our understanding, let us assume the following meanings for the variables that we will use to explain Naive Bayes:

| Variable | Possible values | Meaning |
|----------|-----------------|---------|
| $C$ | "pos", "neg" | Class of a tweet (positive or negative) |
| $F_1$ | Non-negative integers | Number of occurrences of "awesome" in the tweet |
| $F_2$ | Non-negative integers | Number of occurrences of "crazy" in the tweet |
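To make the two count features concrete, here is a minimal sketch of how they could be computed for a single tweet. The function name `count_features` and the simple letter-based tokenization are assumptions for illustration, not code from the book.

```python
import re


def count_features(tweet):
    """Return (F1, F2): how often 'awesome' and 'crazy' occur in the tweet."""
    # Lowercase the text and keep runs of letters as tokens.
    # This tokenization is deliberately simplistic (an assumption).
    tokens = re.findall(r"[a-z]+", tweet.lower())
    f1 = tokens.count("awesome")  # occurrences of "awesome"
    f2 = tokens.count("crazy")    # occurrences of "crazy"
    return f1, f2


print(count_features("Awesome show, crazy crowd, awesome night!"))  # (2, 1)
```

Both values are non-negative integers, matching the "Possible values" given for the two feature variables.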

During training, we learn the Naive Bayes model, which is the probability for a class $C$ when we already know the features $F_1$ and $F_2$. This probability is written as $P(C \mid F_1, F_2)$. Since we cannot estimate this directly, we apply a trick discovered by Bayes:

$$P(A) \cdot P(B \mid A) = P(B) \cdot P(A \mid B)$$
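Bayes' identity, $P(A) \cdot P(B \mid A) = P(B) \cdot P(A \mid B)$, can be checked numerically: both sides equal the joint probability $P(A, B)$. The sketch below does this on a small joint distribution; the probabilities are made up purely for illustration.

```python
import math

# Hypothetical joint probabilities P(A=a, B=b) for two binary events.
# The numbers are invented; they only need to sum to 1.
joint = {
    (True, True): 0.12,
    (True, False): 0.18,
    (False, True): 0.28,
    (False, False): 0.42,
}

p_a = sum(p for (a, _), p in joint.items() if a)  # P(A)
p_b = sum(p for (_, b), p in joint.items() if b)  # P(B)
p_b_given_a = joint[(True, True)] / p_a           # P(B | A)
p_a_given_b = joint[(True, True)] / p_b           # P(A | B)

lhs = p_a * p_b_given_a  # P(A) * P(B | A)
rhs = p_b * p_a_given_b  # P(B) * P(A | B)
print(math.isclose(lhs, rhs))  # True
```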

If we let $A$ be the occurrence of both features $F_1$ and $F_2$ and think of $B$ as being our class $C$, we arrive at the relationship that helps us to later retrieve the probability of a data instance belonging to the specified class:

$$P(F_1, F_2) \cdot P(C \mid F_1, F_2) = P(C) \cdot P(F_1, F_2 \mid C)$$

This allows us to express $P(C \mid F_1, F_2)$ by means of the other probabilities:

$$P(C \mid F_1, F_2) = \frac{P(C) \cdot P(F_1, F_2 \mid C)}{P(F_1, F_2)}$$

We could also say that:

$$\text{posterior} = \frac{\text{prior} \cdot \text{likelihood}}{\text{evidence}}$$
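As a small worked example of the relationship posterior = prior · likelihood / evidence, the sketch below computes the class probabilities for one observed feature combination. All numbers are invented, and the helper name `posterior` is hypothetical. The evidence is obtained by summing prior · likelihood over both classes (the law of total probability), which is an addition not spelled out in the text above.

```python
def posterior(prior, likelihood, evidence):
    """Bayes' rule: P(C | F1, F2) = P(C) * P(F1, F2 | C) / P(F1, F2)."""
    return prior * likelihood / evidence


# Made-up values for one particular observation of (F1, F2).
prior = {"pos": 0.5, "neg": 0.5}         # P(C)
likelihood = {"pos": 0.08, "neg": 0.02}  # P(F1, F2 | C)

# Evidence P(F1, F2): sum over classes of P(C) * P(F1, F2 | C).
evidence = sum(prior[c] * likelihood[c] for c in prior)

post = {c: posterior(prior[c], likelihood[c], evidence) for c in prior}
best = max(post, key=post.get)  # class with the highest posterior
print(best, post)
```

With these numbers the posterior for "pos" comes out at roughly 0.8 and for "neg" at roughly 0.2, so the instance would be assigned to "pos".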
