11.04.2024 Views

Thinking-data-science-a-data-science-practitioners-guide


2 Dimensionality Reduction

You can also run the correlation check in code, like this:

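import numpy as np

# Feature columns: everything except the first and last column of df
feature_cols = df.columns[1:-1]
corr_values = df[feature_cols].corr()

# Blank out the lower triangle and the diagonal so each pair appears once
indexes = np.tril_indices_from(corr_values)
for coord in zip(*indexes):
    corr_values.iloc[coord[0], coord[1]] = np.nan

# Reshape to one row per feature pair; dropna() removes the blanked entries
corr_values = (corr_values
               .stack()
               .dropna()
               .to_frame()
               .reset_index()
               .rename(columns={'level_0': 'feature1',
                                'level_1': 'feature2',
                                0: 'correlation'}))
corr_values['abs_correlation'] = corr_values.correlation.abs()
corr_values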

The partial output on our dataset is shown in Fig. 2.4.

Fig. 2.4 Correlations in our dataset
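Once the pairwise table exists, a natural next step is to list the pairs whose absolute correlation is high enough to suggest redundancy. The following is a minimal sketch of that step, not the book's own code: the toy DataFrame, its column names (a, b, c), and the 0.8 cutoff are all assumptions chosen for illustration, and the upper triangle is selected with a boolean mask rather than the explicit loop used above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 'b' is nearly a copy of 'a', 'c' is independent noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
toy = pd.DataFrame({
    "a": a,
    "b": a + rng.normal(scale=0.05, size=200),
    "c": rng.normal(size=200),
})

corr = toy.corr()

# Keep only the strict upper triangle so each pair appears once
# and the diagonal (self-correlation) is excluded.
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = (corr.where(mask)
             .stack()
             .dropna()
             .rename_axis(["feature1", "feature2"])
             .reset_index(name="correlation"))
pairs["abs_correlation"] = pairs["correlation"].abs()

# Assumed cutoff: treat |r| > 0.8 as "highly correlated".
THRESHOLD = 0.8
high = (pairs[pairs["abs_correlation"] > THRESHOLD]
        .sort_values("abs_correlation", ascending=False))
print(high)
```

Each row in `high` names two features that carry largely the same information, so one of them is a candidate for removal. The right cutoff depends on the dataset and the model; 0.8 here is only a placeholder.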
