Thinking Data Science: A Data Science Practitioner's Guide

Contents

10 Connectivity-Based Clustering
    Agglomerative Clustering
        In a Nutshell
        The Working
        Advantages and Disadvantages
        Applications
        Implementation
        Project
    Divisive Clustering
        In a Nutshell
        The Working
        Implementation Challenges
    Summary

11 Gaussian Mixture Model
    In a Nutshell
    Gaussian Distribution
    Probability Distribution
    Selecting Number of Clusters
    Implementation
    Project
    Determining Optimal Number of Clusters
    Summary

12 Density-Based Clustering
    DBSCAN
        In a Nutshell
        Why DBSCAN?
        Preliminaries
        Algorithm Working
        Advantages and Disadvantages
        Implementation
        Project
    OPTICS
        In a Nutshell
        Core Distance
        Reachability Distance
        Implementation
        Project
    Mean Shift Clustering
        In a Nutshell
        Algorithm Working
        Implementation
        Project
    Summary
