Advanced Data Analytics Using Python: With Machine Learning, Deep Learning and NLP Examples (2023)
CHAPTER 4
Unsupervised Learning: Clustering
In Chapter 3 we discussed how training data can be used to categorize
customer comments according to sentiment (positive, negative, neutral),
as well as according to context. For example, in the airline domain, the
context can be punctuality, food, comfort, entertainment, and so on. Using
this analysis, a business owner can determine which areas of the business
need the most attention. For instance, if the highest
percentage of negative comments is about food, then the priority
will be the quality of food being served to the customers. However, there
are scenarios where business owners are not sure about the context. There
are also cases where training data is not available. Moreover, the frame
of reference can change with time. Classification algorithms cannot be
applied where target classes are unknown. A clustering algorithm is used
in these kinds of situations. A conventional application of clustering is
found in the wine-making industry, where each cluster represents a brand
of wine, and wines are clustered according to the ratios of their
components. In Chapter 3, you learned that classification can be used to recognize
a type of image, but there are scenarios where one image has multiple
shapes and an algorithm is needed to separate the figures. Clustering
algorithms are used in this kind of use case.
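
To make the idea of grouping unlabeled data concrete, here is a minimal sketch of one common clustering method, k-means, written in plain Python. It is an illustration only and not necessarily the approach developed later in the chapter. The function name kmeans, the choice of k=2, the use of the first k points as starting centroids, and the sample measurements (invented alcohol and acidity values for six wines) are all assumptions made for this example.

```python
import math

def kmeans(points, k, iterations=20):
    """Group points into k clusters with plain k-means (illustrative sketch)."""
    # Assumption: deterministic start, using the first k points as centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # centroids stopped moving, so the grouping is stable
        centroids = new_centroids
    return labels, centroids

# Invented (alcohol %, acidity g/L) values for six wines; no labels are given.
samples = [
    (12.1, 3.2), (13.9, 2.1), (12.3, 3.0),
    (14.1, 2.0), (12.0, 3.3), (13.8, 2.2),
]
labels, centroids = kmeans(samples, k=2)
print(labels)  # → [0, 1, 0, 1, 0, 1]
```

No target classes are supplied anywhere in this sketch: the two groups emerge only from the distances between the measurements, which is what distinguishes clustering from the classification algorithms of Chapter 3.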
© Sayan Mukhopadhyay 2018
S. Mukhopadhyay, Advanced Data Analytics Using Python,
https://doi.org/10.1007/978-1-4842-3450-1_4