Comparing R, PAL, and APL Capabilities

Because SAP HANA supports multiple execution engines, users can choose the best predictive library for the task and even combine multiple libraries at once. It is important to realize that R, PAL, and APL are complementary technologies that give users a choice in how to plan their projects. The following sections discuss the decision criteria for choosing which library, or combination of libraries, is most appropriate for a specific use case.

Note: SAP recommends deploying APL and PAL in all SAP HANA platform deployments that require predictive functionality. Customers who need additional functionality beyond what PAL provides can also deploy an R server in a sidecar configuration.

ALGORITHMIC VERSUS AUTOMATED PREDICTIVE: R AND PAL VERSUS APL

The choice between algorithmic (R and PAL) and automated (APL) predictive capabilities depends largely on the target users and their needs. APL provides the flexibility to automate the predictive analytics workflow without requiring users to know how to build complex data models from scratch. PAL or R typically requires a user to create procedures manually for each stage of the predictive modeling workflow.

Data scientists naturally tend to prefer algorithmic techniques that offer a high degree of control and precision in the modeling process. This flexibility comes at a cost: both R and PAL require users to be properly trained in data science techniques, as they must understand what each algorithm does, how it works, and how to interpret the results. Even seasoned data scientists working on less sophisticated analyses need to invest time to go through the full predictive analytics workflow on each problem when using algorithmic techniques.

Automated analytics handles many of the predictive-modeling steps that a data scientist typically performs for common workflows like classification, regression, and association analysis, saving the user time and effort. The automated machine learning engine still performs the full predictive analytics workflow but requires very little input from the user. The result is a significantly faster analysis with fewer configuration parameters.
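
The contrast between the two styles can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the functions `split`, `fit_stump`, `accuracy`, and `auto_classify`, the toy data, and the one-rule classifier are all hypothetical and are not PAL, APL, or R interfaces. The algorithmic path has the user drive each stage and choose the input variable, while the automated path runs the same stages from a single call.

```python
# Hypothetical stand-ins for illustration only; not PAL, APL, or R APIs.

def split(rows, train_fraction=0.75):
    """Stage 1: deterministic train/validation split."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def predict(model, row):
    """Predict 1 when the chosen feature exceeds the learned threshold."""
    return int(row[model["feature"]] > model["threshold"])

def accuracy(model, rows, target):
    """Stage 3: share of rows classified correctly."""
    return sum(predict(model, r) == r[target] for r in rows) / len(rows)

def fit_stump(train, target, feature):
    """Stage 2: learn the best threshold for one user-chosen feature."""
    candidates = [{"feature": feature, "threshold": r[feature]} for r in train]
    return max(candidates, key=lambda m: accuracy(m, train, target))

def auto_classify(rows, target):
    """Automated style: one call runs every stage and also picks the feature."""
    train, valid = split(rows)
    features = [k for k in rows[0] if k != target]
    models = [fit_stump(train, target, f) for f in features]
    best = max(models, key=lambda m: accuracy(m, train, target))
    return best, accuracy(best, valid, target)

# Toy data (invented): a customer buys when they have more than 3 visits.
pairs = [(1, 50), (5, 20), (2, 70), (6, 10), (4, 40), (3, 90), (7, 30), (0, 60)]
rows = [{"visits": v, "spend": s, "buys": int(v > 3)} for v, s in pairs]

# Algorithmic style: the user runs each stage and decides which feature to model.
train, valid = split(rows)
manual_model = fit_stump(train, "buys", "visits")
manual_score = accuracy(manual_model, valid, "buys")

# Automated style: the only inputs are the data and the target column.
auto_model, auto_score = auto_classify(rows, "buys")

print(manual_model, manual_score)
print(auto_model, auto_score)
```

Both paths run the same split, fit, and evaluate stages. The difference is how many decisions the user has to make, which is the trade-off between control and speed described above.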

In general, both data scientists and business analysts should start their analysis by using the automated predictive capabilities of SAP HANA whenever possible. Automated machine learning can address a growing number of scenarios and can typically produce valid results in seconds or minutes. This enables those who are not data scientists to answer their own questions and quickly iterate on the results in a self-service manner, while giving data scientists an automated way of analyzing many problems quickly.

Even in cases where a data scientist wants to create a more complex model or be in complete control of each algorithmic parameter, it is still appropriate to start with APL. That way, the data scientist can understand the data and create hypotheses before transitioning to an algorithmic method such as SAP HANA–native PAL or offboard R scripts.

© 2016 SAP SE or an SAP affiliate company. All rights reserved.
