25.10.2016 Views

SAP HANA Predictive Analysis Library (PAL)


Hence, the generated bins would be the following:

Table 307:

Bin      Value Range
Bin 1    [1, 10)
Bin 2    [10, 20)
Bin 3    [20, 30)
Bin 4    [30, 40)
Bin 5    [40, 43]
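As an illustration, the boundaries in Table 307 can be reproduced with a short Python sketch. It assumes, reading from the table only, that the data range from 1 to 43, that the bin width is 10, and that interior boundaries fall on multiples of the width. The function names `aligned_bin_edges` and `assign_bin` are hypothetical and are not part of PAL.

```python
import math

def aligned_bin_edges(lo, hi, width):
    """Return lo, every multiple of width strictly between lo and hi, then hi.

    Assumption: interior boundaries sit on multiples of the bin width,
    which is one way to arrive at the ranges shown in Table 307.
    """
    edges = [lo]
    edge = (math.floor(lo / width) + 1) * width  # first multiple above lo
    while edge < hi:
        edges.append(edge)
        edge += width
    edges.append(hi)
    return edges

def assign_bin(value, edges):
    """Return the 1-based bin number for value.

    Bins are half-open [a, b), except the last bin, which is closed [a, b].
    """
    if value < edges[0] or value > edges[-1]:
        raise ValueError("value outside the binned range")
    for i in range(1, len(edges) - 1):
        if value < edges[i]:
            return i
    return len(edges) - 1

edges = aligned_bin_edges(1, 43, 10)
print(edges)
print(assign_bin(10, edges), assign_bin(43, edges))
```

A value equal to an interior boundary, such as 10, lands in the higher bin because the lower bin is open on the right.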

● Equal number of records per bin
  Assign an equal number of records to each bin.
  For example:
  ○ 2 bins, each containing 50% of the cases (below the median / above the median)
  ○ 4 bins, each containing 25% of the cases (grouped by the quartiles)
  ○ 5 bins, each containing 20% of the cases (grouped by the quintiles)
  ○ 10 bins, each containing 10% of the cases (grouped by the deciles)
  ○ 20 bins, each containing 5% of the cases (grouped by the vingtiles)
  ○ 100 bins, each containing 1% of the cases (grouped by the percentiles)
  A tie condition results when the values on either side of a cut point are identical. In this case, the tied values are moved up to the next bin.
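The equal-count rule and its tie handling can be sketched in Python as follows. This is a minimal illustration, not the PAL implementation: the function name `equal_frequency_bins` is hypothetical, and placing the nominal cut points at `k * n // n_bins` is an assumption.

```python
def equal_frequency_bins(values, n_bins):
    """Split values into n_bins groups of roughly equal size.

    When the values on either side of a cut point are identical, the cut
    is shifted down so that all tied values move up to the next bin.
    """
    ordered = sorted(values)
    n = len(ordered)
    bins = []
    start = 0
    for k in range(1, n_bins):
        cut = k * n // n_bins  # nominal cut point (assumption)
        # Tie condition: the same value appears on both sides of the cut.
        while start < cut < n and ordered[cut - 1] == ordered[cut]:
            cut -= 1
        bins.append(ordered[start:cut])
        start = cut
    bins.append(ordered[start:])
    return bins

print(equal_frequency_bins([1, 2, 3, 4, 5, 6, 7, 8], 4))
print(equal_frequency_bins([1, 2, 2, 2, 3, 4], 2))
```

In the second call the nominal cut falls between two 2s, so all three 2s move to the upper bin and the bins are no longer equal in size. With heavily tied data a bin can even end up empty.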

● Mean / standard deviation bin boundaries
  The mean and standard deviation can be used to create bins that lie above or below the mean. The rules are as follows:
  ○ + and –1 standard deviation, so
    Bin 1 contains values less than –1 standard deviation from the mean
    Bin 2 contains values between –1 and +1 standard deviation from the mean
    Bin 3 contains values greater than +1 standard deviation from the mean
  ○ + and –2 standard deviations, so
    Bin 1 contains values less than –2 standard deviations from the mean
    Bin 2 contains values between –2 and –1 standard deviations from the mean
    Bin 3 contains values between –1 and +1 standard deviation from the mean
    Bin 4 contains values between +1 and +2 standard deviations from the mean
    Bin 5 contains values greater than +2 standard deviations from the mean
  ○ + and –3 standard deviations, so
    Bin 1 contains values less than –3 standard deviations from the mean
    Bin 2 contains values between –3 and –2 standard deviations from the mean
    Bin 3 contains values between –2 and –1 standard deviations from the mean
    Bin 4 contains values between –1 and +1 standard deviation from the mean
    Bin 5 contains values between +1 and +2 standard deviations from the mean
    Bin 6 contains values between +2 and +3 standard deviations from the mean
    Bin 7 contains values greater than +3 standard deviations from the mean
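The three rules follow one pattern: with boundaries at 1 to k standard deviations on either side of the mean, there are 2k + 1 bins. A minimal Python sketch of that pattern is shown below. The function name `std_bin_index` is hypothetical, and the handling of values that fall exactly on a boundary is an assumption, since the text does not specify it.

```python
def std_bin_index(value, mean, sd, k):
    """Return the 1-based bin number among 2k + 1 bins.

    Boundaries are mean - k*sd, ..., mean - sd, mean + sd, ..., mean + k*sd.
    Assumption: a value exactly on a boundary belongs to the bin nearer
    the mean (the outer bins use strict "less than" / "greater than").
    """
    lower = [mean - j * sd for j in range(k, 0, -1)]
    upper = [mean + j * sd for j in range(1, k + 1)]
    index = 1
    index += sum(1 for b in lower if value >= b)  # lower boundaries passed
    index += sum(1 for b in upper if value > b)   # upper boundaries exceeded
    return index

# Example with mean 50 and standard deviation 10, using +/- 2 standard deviations.
for v in (25, 35, 50, 65, 75):
    print(v, std_bin_index(v, 50, 10, 2))
```

With these example figures the five values fall into bins 1 through 5 in order.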

Smoothing Methods

There are three methods for smoothing:

● Smoothing by bin means: each value within a bin is replaced by the average of all the values belonging to the same bin.
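Smoothing by bin means can be sketched in a few lines of Python. The function name `smooth_by_bin_means` and the sample values are illustrative assumptions; the input is taken to be a list of bins that have already been formed by one of the binning methods.

```python
def smooth_by_bin_means(bins):
    """Replace every value in each bin with the mean of that bin."""
    smoothed = []
    for members in bins:
        if not members:
            smoothed.append([])  # an empty bin stays empty
            continue
        mean = sum(members) / len(members)
        smoothed.append([mean] * len(members))
    return smoothed

print(smooth_by_bin_means([[4, 8, 15], [21, 21, 24]]))
```

Each bin keeps its number of records, but all records in a bin now share a single value.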
