Advanced Data Analytics Using Python: With Machine Learning, Deep Learning and NLP Examples (2023)
Chapter 3
Supervised Learning Using Python
Intentionally Bias the Model to Over-Fit or Under-Fit
Sometimes you need to over- or under-predict intentionally. In an
auction, when you are predicting from the buy side, it is always good
if your bid is a little lower than the original price. Similarly, on the sell
side, it is desirable to set the price a little higher than the original. You can
do this in two ways. In regression, when you are selecting features
using correlation, you can over-predict intentionally by dropping some
variables with negative correlation; similarly, you can under-predict by
dropping some variables with positive correlation. There is another way
of dealing with this. Along with the value, you can also predict the error
of the prediction with a second model. Taking the error as the predicted
value minus the actual value, to under-predict you reduce the prediction
by the error amount when the predicted error is positive. Similarly, to
over-predict, you increase the prediction by the error amount when the
predicted error is negative.
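The error-correction idea above can be sketched as follows; this is a minimal illustration, assuming scikit-learn and a synthetic dataset (the linear model, feature shapes, and noise level are all made up for the example):

```python
# Sketch: bias a regression model by predicting its own error.
# A second model learns the signed error (prediction - actual) of the
# first model; we then shift predictions only in one direction.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                      # synthetic features
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)
pred = model.predict(X)

# Second model predicts the signed error of the first one.
error_model = LinearRegression().fit(X, pred - y)
pred_error = error_model.predict(X)

# To under-predict: reduce the prediction where the predicted error
# is positive (the model is already predicting too high).
under = pred - np.clip(pred_error, 0, None)

# To over-predict: increase the prediction where the predicted error
# is negative (the model is predicting too low).
over = pred - np.clip(pred_error, None, 0)
```

By construction, `under` is never above the raw prediction and `over` is never below it, which gives the systematic buy-side or sell-side bias the text describes.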
Another problem in classification is biased training data. Suppose
you have two target classes, A and B, and the majority (say 90 percent) of
the training data is class A. If you train your model on this data, nearly all
of your predictions will be class A. One solution is biased sampling of the
training data: intentionally remove some of the class A examples from the
training set. Another approach works for binary classification. Because
class B is the minority, the predicted probability of class B for any sample
will almost always be less than 0.5, so the default threshold rarely selects
it. Instead, calculate the average class B probability over all points; for any
point, if its class B probability is greater than that average, mark it as
class B, and otherwise class A.
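A minimal sketch of the average-probability threshold, assuming scikit-learn and a synthetic 90/10 imbalanced dataset (the class distributions here are invented for illustration):

```python
# Sketch: classify an imbalanced binary problem by thresholding at the
# average minority-class probability instead of the default 0.5.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n_a, n_b = 900, 100                                # 90% class A, 10% class B
X = np.vstack([rng.normal(0.0, 1.0, size=(n_a, 2)),
               rng.normal(1.0, 1.0, size=(n_b, 2))])
y = np.array([0] * n_a + [1] * n_b)                # 1 = minority class B

clf = LogisticRegression().fit(X, y)
p_b = clf.predict_proba(X)[:, 1]                   # probability of class B

threshold = p_b.mean()                             # average class B probability
pred = (p_b > threshold).astype(int)               # B if above average, else A

# The adjusted threshold recovers far more class B predictions than 0.5.
print((p_b > 0.5).sum(), pred.sum())
```

For a calibrated model the mean predicted probability sits near the minority base rate (about 0.1 here), so the adjusted threshold labels many more points as class B than the default 0.5 cutoff would.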