25.10.2016 Views

SAP HANA Predictive Analysis Library (PAL)


Expected Result

3.6.10 Scaling Range

In real-world scenarios, the collected continuous attributes are usually distributed within different ranges. It is common practice to scale the data so that data mining algorithms such as neural networks, nearest neighbor classification, and clustering give more reliable results.

This release of PAL provides three scaling range methods, described below. In the following, $X_{ip}$ and $Y_{ip}$ are the original value and the transformed value of the i-th record and p-th attribute, respectively.

1. Min-Max Normalization

Each transformed value is within the range [new_minA, new_maxA], where new_minA and new_maxA are user-specified parameters. Supposing that minA and maxA are the minimum and maximum values of attribute A, we get the following calculation formula:

$$Y_{ip} = \frac{(X_{ip} - \mathrm{minA}) \times (\mathrm{new\_maxA} - \mathrm{new\_minA})}{\mathrm{maxA} - \mathrm{minA}} + \mathrm{new\_minA}$$
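The min-max formula above can be illustrated with a short Python sketch. This is not PAL code: the function name `min_max_scale` and its parameters are hypothetical, and the handling of a constant attribute (where maxA equals minA) is an assumption, since the text does not say how PAL treats that case.

```python
from typing import List, Sequence


def min_max_scale(values: Sequence[float],
                  new_min: float = 0.0,
                  new_max: float = 1.0) -> List[float]:
    """Rescale one attribute column into [new_min, new_max]."""
    lo, hi = min(values), max(values)  # minA and maxA in the text
    if hi == lo:
        # Assumption: a constant column maps to new_min to avoid dividing
        # by zero. The source does not describe PAL's behavior here.
        return [new_min for _ in values]
    span = new_max - new_min
    # Y = (X - minA) * (new_maxA - new_minA) / (maxA - minA) + new_minA
    return [(x - lo) * span / (hi - lo) + new_min for x in values]


scaled = min_max_scale([10.0, 20.0, 40.0], new_min=0.0, new_max=1.0)
print(scaled)
```

The smallest original value maps to new_minA and the largest to new_maxA; every other value keeps its relative position between the two.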

2. Z-Score Normalization (or zero-mean normalization)

PAL uses three z-score methods.

○ Mean-Standard Deviation

The transformed values have mean 0 and standard deviation 1. The transformation is made as follows:

$$Y_{ip} = \frac{X_{ip} - \mu_p}{\sigma_p}$$

where $\mu_p$ and $\sigma_p$ are the mean and standard deviation of the original values of the p-th attribute.

○ Mean-Mean Absolute Deviation
