Comparing R, PAL, and APL Capabilities<br />
Because SAP HANA supports multiple execution<br />
engines, users can choose the best predictive<br />
library for the task – and even combine usage of<br />
multiple libraries at once. R, PAL, and APL are<br />
complementary technologies that give users a choice<br />
in how to plan their projects.<br />
The following sections discuss the decision<br />
criteria for choosing which library or combination<br />
of libraries would be most appropriate for a specific<br />
use case.<br />
Note: SAP recommends deploying APL and PAL<br />
in all SAP HANA platform deployments that<br />
require predictive functionality. Customers who<br />
need additional functionality beyond what PAL<br />
provides can also deploy an R server in a sidecar<br />
configuration.<br />
ALGORITHMIC VERSUS AUTOMATED<br />
PREDICTIVE: R AND PAL VERSUS APL<br />
The choice between algorithmic (R and PAL) and<br />
automated (APL) predictive capabilities depends<br />
largely on the target users and their needs.<br />
The APL automates the predictive analytics<br />
workflow, so users do not need to know how to<br />
build complex data models from scratch. PAL and R<br />
typically require a user to create procedures<br />
manually for each stage of the predictive<br />
modeling workflow.<br />
Data scientists naturally tend to prefer algorithmic<br />
techniques that offer a high degree of control<br />
and precision in the modeling process. This flexibility<br />
comes at a cost: both R and PAL require<br />
users to be properly trained in data science techniques,<br />
as they must understand what each algorithm<br />
does, how it works, and how to interpret<br />
the results. Even seasoned data scientists working<br />
on less-sophisticated analyses need to invest<br />
time to go through the full predictive analytics<br />
workflow on each problem when using algorithmic<br />
techniques.<br />
Automated analytics automates many of the<br />
predictive-modeling steps that a data scientist<br />
typically performs for common workflows like<br />
classification, regression, and association analysis,<br />
saving the user time and effort. The automated<br />
machine learning engine still performs the<br />
full predictive analytics workflow but requires very<br />
little input from the user. The result is a significantly<br />
faster analysis that has fewer configuration<br />
parameters.<br />
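The contrast between the two styles can be sketched with a toy example. This is a minimal illustration in plain Python, not the method used by APL, PAL, or R: the nearest-neighbor regressor, the function names `algorithmic_fit` and `automated_fit`, the candidate values, and the synthetic data are all assumptions made for the sketch. In the algorithmic style the user picks the parameter `k`; in the automated style the routine splits the data, tries each candidate, and keeps the one with the lowest holdout error.

```python
def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def holdout_error(train, valid, k):
    """Mean squared error on the validation points for a given k."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in valid) / len(valid)


def algorithmic_fit(train, k):
    """Algorithmic style (hypothetical helper): the user chooses k explicitly."""
    return lambda x: knn_predict(train, x, k)


def automated_fit(data, candidates=(1, 3, 5)):
    """Automated style (hypothetical helper): pick k by holdout validation."""
    train, valid = data[::2], data[1::2]  # simple alternating split
    best_k = min(candidates, key=lambda k: holdout_error(train, valid, k))
    # Refit on all data with the selected parameter.
    return best_k, (lambda x: knn_predict(data, x, best_k))


# Synthetic data on the line y = 2x + 1 (an assumption for illustration only).
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]

manual_model = algorithmic_fit(data, k=3)   # user supplies the parameter
best_k, auto_model = automated_fit(data)    # routine selects the parameter

print("manual prediction at x=10:", manual_model(10.0))
print("automated choice of k:", best_k)
print("automated prediction at x=10:", auto_model(10.0))
```

The automated routine still runs the same steps (split, fit, evaluate, select), but the caller supplies only the data, which mirrors the trade-off described above: fewer parameters to configure in exchange for less direct control.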
In general, both data scientists and business<br />
analysts should start their analysis by using the<br />
automated predictive capabilities of SAP HANA<br />
whenever possible. Automated machine learning<br />
can address a growing number of scenarios and<br />
typically can produce valid results in seconds or<br />
minutes. This enables those who are not data<br />
scientists to answer their own questions and<br />
quickly iterate on the results in a self-service<br />
manner while giving data scientists an automated<br />
way of analyzing many problems quickly.<br />
Even when a data scientist wants to create a<br />
more complex model or to control each algorithmic<br />
parameter completely, it is still appropriate to<br />
start with APL. That way, the data scientist can<br />
understand the data and form hypotheses before<br />
transitioning to an algorithmic method such as<br />
SAP HANA–native PAL or offboard R scripts.<br />
© 2016 SAP SE or an SAP affiliate company. All rights reserved.<br />