
Advanced Data Analytics Using Python: With Machine Learning, Deep Learning and NLP Examples (2023)


Chapter 3

Supervised Learning Using Python

Semisupervised Learning

Classification and regression are types of supervised learning. In this type of learning, you have a set of labeled training data on which you train your model. The trained model is then used to predict the targets for test data. For example, suppose you want to classify text according to sentiment. There are three target classes: positive, negative, and neutral. To train your model, you choose some sample text and label each item as positive, negative, or neutral. You use this training data to train the model. Once the model is trained, you can apply it to test data. For example, you may use the Naive Bayes classifier for text classification and try to predict the sentiment of the sentence “Food is good.” In the training phase, the program estimates, for each of the words Food, is, and good taken separately, the probability of that word appearing in positive, negative, and neutral sentences, and stores these probabilities in the model. In the test phase, it combines them, treating the words as independent, into the joint probability of each class when Food, is, and good all come together, and predicts the most probable class.

Conversely, clustering is an example of unsupervised learning, where there is no training data or target class available. The program learns from the data in one shot.

There is also semisupervised learning. Suppose you are classifying text as positive or negative sentiment, but your training data has labels only for positives; the rest of the training data is unlabeled. In this case, as the first step, you train the model assuming all unlabeled data is negative and apply the trained model to the training data. In the output, the unlabeled items that the model predicts as negative are then labeled as negative. Finally, you train your model again with the newly labeled data. The nearest neighbor classifier is also sometimes considered semisupervised learning: it has training data, but it has no training phase, because it stores the examples and defers all computation to prediction time. It is more commonly described as a lazy, instance-based form of supervised learning.
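The Naive Bayes sentiment example described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the book's implementation: the six training sentences, the function names (tokenize, train, predict), and the use of Laplace (add-one) smoothing are all assumptions made for the sketch.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase the text and keep alphabetic tokens only (drops punctuation).
    return re.findall(r"[a-z']+", text.lower())


def train(samples):
    """Training phase: count words per class and sentences per class."""
    word_counts = defaultdict(Counter)  # class label -> word frequencies
    class_counts = Counter()            # class label -> number of sentences
    vocab = set()
    for text, label in samples:
        words = tokenize(text)
        word_counts[label].update(words)
        class_counts[label] += 1
        vocab.update(words)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Test phase: combine per-word probabilities into one score per class."""
    word_counts, class_counts, vocab = model
    total_sentences = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Start from the log prior of the class.
        score = math.log(class_counts[label] / total_sentences)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Laplace smoothing (an assumption of this sketch) keeps
            # unseen words from producing a zero probability.
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical hand-labeled training data, invented for illustration.
training_data = [
    ("Food is good", "positive"),
    ("Service was great", "positive"),
    ("Food is bad", "negative"),
    ("Service was terrible", "negative"),
    ("Food is served at noon", "neutral"),
    ("The menu is printed", "neutral"),
]

model = train(training_data)
prediction = predict(model, "Food is good.")
print(prediction)
```

The scores are summed in log space, which is equivalent to multiplying the per-word probabilities but avoids numerical underflow on longer sentences.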

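The three-step semisupervised procedure (train with unlabeled data treated as negative, relabel the items predicted negative, then retrain) might look like the following sketch. Several things here are assumptions rather than the source's method: a simple nearest-centroid classifier stands in for whatever model you would actually use, the 2-D points stand in for numeric text features, and unlabeled items predicted positive in step 2 are simply left out of the final training set, since the text does not say what to do with them.

```python
def centroid(points):
    # Component-wise mean of a list of equal-length tuples.
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    # Squared Euclidean distance between two points.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(positives, negatives):
    # "Training" a nearest-centroid model: one centroid per class.
    return centroid(positives), centroid(negatives)


def predict(model, point):
    # Assign the class whose centroid is closer to the point.
    pos_c, neg_c = model
    if sq_dist(point, pos_c) <= sq_dist(point, neg_c):
        return "positive"
    return "negative"


# Hypothetical feature vectors, invented for illustration.
positives = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]  # labeled positive
unlabeled = [(5.5, 5.5), (6.0, 6.0), (1.0, 1.0),
             (0.0, 1.0), (1.0, 0.0), (0.5, 0.5)]  # no labels available

# Step 1: train as if every unlabeled item were negative.
first_model = fit(positives, unlabeled)

# Step 2: apply the model to the unlabeled items and keep those it
# predicts as negative; these become the newly labeled negatives.
reliable_negatives = [p for p in unlabeled
                      if predict(first_model, p) == "negative"]

# Step 3: retrain on the known positives and the newly labeled negatives.
final_model = fit(positives, reliable_negatives)

print(reliable_negatives)
print(predict(first_model, (3.5, 3.5)), predict(final_model, (3.5, 3.5)))
```

In this toy data, the two unlabeled points near the positive cluster are not relabeled as negative, so the retrained model places its negative centroid farther from the positives than the first model does.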