
Advanced Data Analytics Using Python_ With Machine Learning, Deep Learning and NLP Examples ( 2023)


Chapter 4

Graph Theoretical Approach

Unsupervised Learning: Clustering

The clustering problem can be mapped to a graph, where every node in

the graph is an input data point. If the distance between two data points is

less than the threshold, then the corresponding nodes are connected.

Now, using a graph partitioning algorithm, you can cluster the graph. One

industry example of clustering is in investment banking, where instruments

are clustered by the correlation of their price time series and the

instruments of each cluster are traded together. This is known as

basket trading in algorithmic trading. So, by using the similarity measure,

you can construct the graph where the nodes are instruments and the

edges between the nodes indicate that the instruments are correlated. To

create the basket, you need a set of instruments where all are correlated

to each other. In a graph, this is a set of nodes, or subgraph, in which all the

nodes are connected to each other. This kind of subgraph

is known as a clique. Finding the clique of maximum size is an NP-hard

problem (its decision version is NP-complete), so people use heuristic

solutions to solve this problem of clustering.
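
The basket construction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's implementation: the price series, the instrument names, the 0.9 correlation threshold, and the helper names `pearson`, `correlation_graph`, and `greedy_clique` are all hypothetical. The greedy pass is one simple heuristic; it returns a clique, but not necessarily the largest one.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_graph(series, threshold):
    """Connect two instruments when their correlation reaches the threshold."""
    graph = {name: set() for name in series}
    for a, b in combinations(series, 2):
        if pearson(series[a], series[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def greedy_clique(graph):
    """Heuristic: visit nodes by descending degree and keep each node
    that is adjacent to every node already kept."""
    clique = []
    for node in sorted(graph, key=lambda n: (-len(graph[n]), n)):
        if all(node in graph[member] for member in clique):
            clique.append(node)
    return clique


# Hypothetical price series for four instruments (illustrative data only).
prices = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.0, 4.1, 6.0, 8.2, 10.0],
    "C": [1.5, 2.4, 3.6, 4.4, 5.5],
    "D": [5.0, 3.0, 4.0, 1.0, 2.0],
}

graph = correlation_graph(prices, threshold=0.9)
basket = greedy_clique(graph)
print(basket)
```

With this toy data, instruments A, B, and C move together and form the basket, while D is negatively correlated with the others and stays isolated. Because the greedy order is fixed by node degree, a different heuristic or an exact search might find a larger clique on real data.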

How Do You Know If the Clustering Result Is Good?

After applying the clustering algorithm, verifying the result as good or bad

is a crucial step in cluster analysis. Three parameters are used to measure

the quality of a cluster: centroid, radius, and diameter.

$$\text{Centroid} = C_m = \frac{\sum_{i=1}^{N} t_{mi}}{N}$$
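
As a rough sketch, the three measures can be computed for a single cluster as follows. Only the centroid formula appears in the text above; the radius (root mean squared distance from the points to the centroid) and the diameter (root mean squared distance over all pairs of distinct points) use commonly cited definitions and are assumptions here, as are the function names and the sample points.

```python
from itertools import combinations
from math import sqrt


def centroid(points):
    """Coordinate-wise mean of the N points t_m1 ... t_mN in the cluster."""
    n = len(points)
    dims = len(points[0])
    return [sum(p[d] for p in points) / n for d in range(dims)]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def radius(points):
    """Assumed definition: root mean squared distance to the centroid."""
    c = centroid(points)
    return sqrt(sum(squared_distance(p, c) for p in points) / len(points))


def diameter(points):
    """Assumed definition: root mean squared distance over all ordered
    pairs of distinct points, i.e. N * (N - 1) pairs."""
    n = len(points)
    unordered_total = sum(
        squared_distance(p, q) for p, q in combinations(points, 2)
    )
    # Each unordered pair stands for two ordered pairs, hence the factor 2.
    return sqrt(2 * unordered_total / (n * (n - 1)))


# Hypothetical cluster: the four corners of a 2 x 2 square.
cluster = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]

print(centroid(cluster))
print(radius(cluster))
print(diameter(cluster))
```

Under these assumed definitions, a small radius and diameter indicate a compact cluster, which gives a simple way to compare clustering results on the same data.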
