APL-Journal - APL Germany e. V.
<strong>APL</strong>-<strong>Journal</strong><br />
We want to access the data for Texas. To do this, we need<br />
to first access the location matrix. This is conceptually a two<br />
column matrix (in the code that follows, the type code 4 denotes integers):<br />
// Path to location matrix<br />
pathm = pathj + "LocMatrix.sma";<br />
// Declare location matrix as an array<br />
LocMat = SmArray.fileArrayRead(pathm,4);<br />
// Determine number of rows in location matrix<br />
int rows = LocMat.getCount() / 2;<br />
// Reshape to a matrix<br />
LocMat = LocMat.reshapeBy(rows,2);<br />
Assume locations are identified by a code and that Texas<br />
is 1 and anyplace else is 0. In practice the location would probably<br />
come from a user interface, but here we set it explicitly to<br />
Texas:<br />
// Select location Texas<br />
int loc = 1;<br />
The location matrix together with the location tells us which<br />
part of each data file needs to be declared as an array.<br />
Here is code to determine the first item of data needed and the<br />
number of items:<br />
// Determine start location and extent<br />
start = LocMat.index(loc, 0).getInt();<br />
extent = SmArray.vector(LocMat.index(loc, 1));<br />
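The lookup above can be illustrated in plain Java without the SmartArrays library. This is only a sketch: it assumes the location matrix is stored flat in row-major order, with column 0 holding the start item and column 1 the number of items, and the class name, method name and sample values are hypothetical.

```java
class LocationLookup {
    // Returns {start, extent} for a location code from a flat, row-major
    // two-column matrix: [start0, extent0, start1, extent1, ...].
    // The column meanings are an assumption made for this sketch.
    static int[] startAndExtent(int[] flatLocMat, int loc) {
        int rows = flatLocMat.length / 2; // same row count as getCount() / 2
        if (loc < 0 || loc >= rows) {
            throw new IllegalArgumentException("unknown location code: " + loc);
        }
        return new int[] { flatLocMat[2 * loc], flatLocMat[2 * loc + 1] };
    }

    public static void main(String[] args) {
        // Hypothetical data: location 0 owns items 0..5, location 1 (Texas) items 6..15.
        int[] flat = { 0, 6, 6, 10 };
        int[] se = startAndExtent(flat, 1);
        System.out.println("start=" + se[0] + " extent=" + se[1]);
    }
}
```

Storing one (start, extent) pair per location only works if the data files are sorted by location, so that each location occupies one contiguous run of items.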
We don’t need the coverage data because we want all coverages.<br />
We do need the risk data, accident quarter, pay quarter<br />
and pay amount. Here is the code to access this data only<br />
for the desired location:<br />
// Access risk data<br />
pathm = path + "cov.sma";<br />
Risk = SmArray.fileArrayRead(pathm, 4, extent, start);<br />
// Access accident quarter<br />
pathm = path + "accqtr.sma";<br />
ACQTR = SmArray.fileArrayRead(pathm, 4, extent, start);<br />
// Access pay quarter<br />
pathm = path + "payqtr.sma";<br />
PayQTR = SmArray.fileArrayRead(pathm, 4, extent, start);<br />
// Access pay amount<br />
pathm = path + "paid.sma";<br />
Paid = SmArray.fileArrayRead(pathm, 5, extent, start);<br />
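To show what reading only part of a file means, here is a rough equivalent using only the standard Java library. It is a sketch, not the SmartArrays implementation: it assumes a file of raw big-endian 4-byte integers with no header, which may well differ from the real .sma layout, and all names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class PartialRead {
    // Writes the values as raw big-endian 4-byte integers (assumed layout).
    static void writeInts(Path file, int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES);
        for (int v : values) {
            buf.putInt(v);
        }
        try {
            Files.write(file, buf.array());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads `extent` integers starting at item index `start`,
    // touching only that byte range of the file.
    static int[] readInts(Path file, long start, int extent) {
        ByteBuffer buf = ByteBuffer.allocate(extent * Integer.BYTES);
        long offset = start * Integer.BYTES;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            while (buf.hasRemaining()) {
                int n = ch.read(buf, offset + buf.position());
                if (n < 0) {
                    throw new IOException("file ends before requested range");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        buf.flip();
        int[] out = new int[extent];
        buf.asIntBuffer().get(out);
        return out;
    }

    // Writes ten hypothetical values to a temporary file and reads items 6..8.
    static int[] demo() {
        try {
            Path tmp = Files.createTempFile("accqtr-demo", ".bin");
            try {
                writeInts(tmp, new int[] { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90 });
                return readInts(tmp, 6, 3);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [60, 70, 80]
    }
}
```

The cost of the read depends on `extent`, not on the size of the file, which is what keeps the lookup fast when the file holds many more rows than the selected location.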
For each of the data arrays, we want only the items that<br />
correspond to Cobol programmers. Here is a mask with 1<br />
wherever the Risk is "Cob" (assuming the data is encoded<br />
as Cobol = 0, C++ = 1, SmartArrays = 2):<br />
// Select Cobol programmers<br />
mask = Risk.eq(0);<br />
Select only data for Cobol programmers:<br />
// Select only data for Cobol programmers<br />
ACQTR = mask.compress(ACQTR);<br />
PayQTR = mask.compress(PayQTR);<br />
Paid = mask.compress(Paid);<br />
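The mask-and-compress step corresponds to a simple filter. The following plain-Java sketch mimics it on hypothetical data; the class, the method names and the use of int for the paid amounts are assumptions made for illustration and are not part of the SmartArrays API.

```java
import java.util.Arrays;

class CompressDemo {
    // Builds a mask that is true wherever values[i] equals target.
    static boolean[] equalTo(int[] values, int target) {
        boolean[] mask = new boolean[values.length];
        for (int i = 0; i < values.length; i++) {
            mask[i] = values[i] == target;
        }
        return mask;
    }

    // Keeps data[i] wherever mask[i] is true, preserving order.
    static int[] compress(boolean[] mask, int[] data) {
        if (mask.length != data.length) {
            throw new IllegalArgumentException("mask and data lengths differ");
        }
        int kept = 0;
        for (boolean m : mask) {
            if (m) {
                kept++;
            }
        }
        int[] out = new int[kept];
        int j = 0;
        for (int i = 0; i < data.length; i++) {
            if (mask[i]) {
                out[j++] = data[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] risk = { 0, 1, 0, 2, 0 };            // hypothetical: 0 = Cobol
        int[] paid = { 100, 200, 300, 400, 500 };  // hypothetical amounts
        boolean[] mask = equalTo(risk, 0);
        System.out.println(Arrays.toString(compress(mask, paid))); // prints [100, 300, 500]
    }
}
```

Because the same mask is applied to every array, the selected items stay aligned across the accident quarter, pay quarter and paid amount arrays.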
For each combination of accident quarter and pay quarter, we want to sum<br />
the losses. In the following code, the dimensions (4,4) could<br />
be computed values; in this example, it is known that there<br />
are 4 quarters of data. The call to selectUpdateBySubscript with Sm.plus<br />
is a "group by sum" operation.<br />
SmArray forsum = SmArray.vector(ACQTR, PayQTR);<br />
SmArray LossT = SmArray.scalar(0).reshapeBy(4,4);<br />
LossT.selectUpdateBySubscript(Sm.plus, Paid, forsum);<br />
"LossT" is the desired loss triangle shown previously:<br />
LossT.show();<br />
979 1661 1823 1853<br />
0 4151 5250 5295<br />
0 0 2314 2710<br />
0 0 0 335<br />
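The "group by sum" step can be written out as an explicit loop. The sketch below uses plain Java and small hypothetical inputs, so it does not reproduce the figures above; it assumes the quarters are encoded as 0-based indices, and the class and method names are illustrative only.

```java
import java.util.Arrays;

class LossTriangle {
    // Adds each paid amount into the cell selected by its
    // (accident quarter, pay quarter) pair. Quarters are assumed 0-based.
    static long[][] groupBySum(int[] accQtr, int[] payQtr, long[] paid, int quarters) {
        if (accQtr.length != payQtr.length || accQtr.length != paid.length) {
            throw new IllegalArgumentException("input arrays must have equal length");
        }
        long[][] table = new long[quarters][quarters]; // starts as all zeros
        for (int i = 0; i < paid.length; i++) {
            table[accQtr[i]][payQtr[i]] += paid[i];
        }
        return table;
    }

    public static void main(String[] args) {
        // Hypothetical records: two payments fall into cell (0,0).
        int[] acc = { 0, 0, 1, 1, 0 };
        int[] pay = { 0, 1, 1, 2, 0 };
        long[] paid = { 100, 50, 200, 25, 10 };
        for (long[] row : groupBySum(acc, pay, paid, 4)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

A payment cannot precede its accident, so every cell below the diagonal stays zero, which gives the result its triangular shape.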
As was said before, this is not a numerically intensive computation<br />
(although SmartArrays loves big computations). It<br />
is only a computation on big data. Using some forethought<br />
in data organization and using the SmartArrays feature that<br />
allows you to declare part of a file as an array means that you<br />
can greatly reduce the amount of data that is ever looked at<br />
and achieve almost instantaneous results even with a hundred<br />
million rows of data.<br />
You could write this computation in many languages using<br />
the recipe discussed here, but the array paradigm of SmartArrays<br />
allowed it to be written in very few lines of code – a<br />
few lines to access the data, some selection lines using compress<br />
to select the desired data, and selectUpdateBySubscript to<br />
sum by groups. A compact program that is close to the description<br />
of the problem is easier to write, easier to read,<br />
easier to maintain, easier to enhance, and easier to understand.<br />
In this, SmartArrays emulates <strong>APL</strong>.<br />
Conclusion<br />
<strong>APL</strong> was the first real array processing language and it<br />
began the Online Analytical Processing (OLAP) industry.<br />
<strong>APL</strong>2 extended this paradigm to more complicated collections<br />
of data. SmartArrays captures and extends the <strong>APL</strong><br />
22 <strong>APL</strong> - <strong>Journal</strong> 2006, 25. Jg., Heft 1/2