
APL-Journal - APL Germany e. V.


We want to access the data for Texas. To do this, we first need to access the location matrix. This is conceptually a two-column matrix (in the following code, 4 means integers):

// Path to location matrix
pathm = pathj + "LocMatrix.sma";
// Declare location matrix as an array
LocMat = SmArray.fileArrayRead(pathm, 4);
// Determine number of rows in location matrix
int rows = LocMat.getCount() / 2;
// Reshape to a matrix
LocMat = LocMat.reshapeBy(rows, 2);
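For readers without SmartArrays at hand, the reshape step can be pictured in plain Java. This is only a sketch under stated assumptions: the class and method names are hypothetical, row-major order is assumed, and nothing here is SmartArrays API.

```java
// Hypothetical helper, not part of SmartArrays: turns a flat vector
// into a rows x 2 matrix, assuming the items are stored row by row.
final class LocMatrixSketch {
    static int[][] toTwoColumns(int[] flat) {
        if (flat.length % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of items");
        }
        int rows = flat.length / 2;      // same arithmetic as getCount() / 2
        int[][] matrix = new int[rows][2];
        for (int r = 0; r < rows; r++) {
            matrix[r][0] = flat[2 * r];      // first column of row r
            matrix[r][1] = flat[2 * r + 1];  // second column of row r
        }
        return matrix;
    }
}
```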

Assume locations are identified by a code and that Texas is 1 and anyplace else is 0. The selection of location probably comes from a user interface, but we set it explicitly to Texas:

// Select location Texas
int loc = 1;

The location matrix together with the location tells us what part of the data files needs to be declared as arrays. Here is code to determine the first item of data needed and the number of items:

// Determine start location and extent
start = LocMat.index(loc, 0).getInt();
extent = SmArray.vector(LocMat.index(loc, 1));
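The lookup above can be sketched in plain Java as follows. The class `LocationSlice` is hypothetical, and the sketch assumes the location code is used directly as a zero-based row index, as the `index(loc, 0)` and `index(loc, 1)` calls suggest.

```java
// Hypothetical plain-Java view of the lookup; not SmartArrays code.
final class LocationSlice {
    final int start;   // index of the first data item for the location
    final int extent;  // number of consecutive items for the location

    LocationSlice(int start, int extent) {
        this.start = start;
        this.extent = extent;
    }

    // Column 0 holds the start, column 1 the extent (as in the article's code).
    static LocationSlice forLocation(int[][] locMat, int loc) {
        if (loc < 0 || loc >= locMat.length) {
            throw new IndexOutOfBoundsException("unknown location code: " + loc);
        }
        return new LocationSlice(locMat[loc][0], locMat[loc][1]);
    }
}
```

With a made-up matrix `{{0, 3}, {3, 5}}`, location 1 would map to start 3 and extent 5, meaning items 3 through 7 of every data file belong to that location. This only works if the data files are sorted by location, which is the forethought in data organization the article relies on.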

We don't need the coverage data because we want all coverages. We do need the risk data, accident quarter, pay quarter and pay amount. Here is the code to access this data only for the desired location:

// Access risk data
pathm = path + "cov.sma";
Risk = SmArray.fileArrayRead(pathm, 4, extent, start);
// Access accident quarter
pathm = path + "accqtr.sma";
ACQTR = SmArray.fileArrayRead(pathm, 4, extent, start);
// Access pay quarter
pathm = path + "payqtr.sma";
PayQTR = SmArray.fileArrayRead(pathm, 4, extent, start);
// Access pay amount
pathm = path + "paid.sma";
Paid = SmArray.fileArrayRead(pathm, 5, extent, start);
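The idea of declaring only part of a file as an array can be illustrated with the standard Java library. The sketch below is not how SmartArrays works internally, and it does not read the `.sma` format: it assumes a hypothetical file of raw 4-byte big-endian integers with no header. The class name `SliceReader` is likewise invented for illustration.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

// Hypothetical sketch: reads a contiguous run of ints from a raw binary file.
// Assumes 4-byte big-endian ints and no file header (NOT the .sma layout).
final class SliceReader {
    // Parameter order mirrors the article's call: path, extent, start.
    static int[] readInts(String path, int extent, long start) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            // Jump straight to the first wanted item; earlier bytes are never read.
            file.seek(start * Integer.BYTES);
            byte[] raw = new byte[extent * Integer.BYTES];
            file.readFully(raw);
            int[] items = new int[extent];
            // ByteBuffer is big-endian by default, matching the assumed layout.
            ByteBuffer.wrap(raw).asIntBuffer().get(items);
            return items;
        }
    }
}
```

The cost of the read depends on `extent`, not on the size of the file, which is why the location matrix pays off when the file holds many locations.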

For each of the data arrays, we want only the items that correspond to Cobol programmers. Here is a mask with 1 wherever the Risk is "Cob" (assuming the data is encoded with Cobol 0, C++ 1, SmartArrays 2):

// Select Cobol programmers
mask = Risk.eq(0);

Select only data for Cobol programmers:

// Select only data for Cobol programmers
ACQTR = mask.compress(ACQTR);
PayQTR = mask.compress(PayQTR);
Paid = mask.compress(Paid);
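For comparison, here is roughly what the mask and compress steps amount to when written out with loops in plain Java. The class `CompressSketch` and its methods are hypothetical stand-ins, not SmartArrays calls.

```java
// Hypothetical loop-based equivalents of an equality mask and compress.
final class CompressSketch {
    // mask[i] is true wherever values[i] equals target.
    static boolean[] equalTo(int[] values, int target) {
        boolean[] mask = new boolean[values.length];
        for (int i = 0; i < values.length; i++) {
            mask[i] = values[i] == target;
        }
        return mask;
    }

    // Keeps only the items of data whose mask entry is true, in order.
    static int[] compress(boolean[] mask, int[] data) {
        if (mask.length != data.length) {
            throw new IllegalArgumentException("mask and data must have equal length");
        }
        int kept = 0;
        for (boolean m : mask) {
            if (m) {
                kept++;
            }
        }
        int[] out = new int[kept];
        int k = 0;
        for (int i = 0; i < data.length; i++) {
            if (mask[i]) {
                out[k++] = data[i];
            }
        }
        return out;
    }
}
```

Because the same mask is applied to every column, the compressed arrays stay aligned item by item.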

For each accident quarter and pay quarter we want to sum the losses. In the following code, the numbers (4,4) could be computed values; in this example, it is known that there are 4 quarters of data. The selectUpdateBySubscript call with Sm.plus is a "group by sum" operation.

SmArray forsum = SmArray.vector(ACQTR, PayQTR);
SmArray LossT = SmArray.scalar(0).reshapeBy(4, 4);
LossT.selectUpdateBySubscript(Sm.plus, Paid, forsum);

"LossT" is the desired loss triangle shown previously:

LossT.show();
 979  1661  1823  1853
   0  4151  5250  5295
   0     0  2314  2710
   0     0     0   335
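The "group by sum" step that builds this table can be spelled out in plain Java as an accumulation loop. This is a sketch only: `GroupBySum` is a hypothetical name, the quarters are assumed to be zero-based indices, and `double` is an assumed element type for the paid amounts.

```java
// Hypothetical loop-based "group by sum": adds each paid amount into the
// cell addressed by its (accident quarter, pay quarter) pair.
final class GroupBySum {
    static double[][] sumBy(int[] accQtr, int[] payQtr, double[] paid,
                            int rows, int cols) {
        if (accQtr.length != payQtr.length || accQtr.length != paid.length) {
            throw new IllegalArgumentException("all three arrays must have equal length");
        }
        double[][] table = new double[rows][cols];  // starts as all zeros
        for (int i = 0; i < paid.length; i++) {
            // Items sharing the same subscript pair accumulate in one cell.
            table[accQtr[i]][payQtr[i]] += paid[i];
        }
        return table;
    }
}
```

Cells that no item addresses keep their initial zero, which matches the zeros below the diagonal of the triangle.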

As noted earlier, this is not a numerically intensive computation (although SmartArrays loves big computations). It is only a computation on big data. With some forethought in data organization, and with the SmartArrays feature that lets you declare part of a file as an array, you can greatly reduce the amount of data that is ever read and achieve almost instantaneous results even with a hundred million rows of data.

You could write this computation in many languages using the recipe discussed here, but the array paradigm of SmartArrays allowed it to be written in very few lines of code: a few lines to access the data, a few selection lines using compress to select the desired data, and selectUpdateBySubscript to sum by groups. A compact program that is close to the description of the problem is easier to write, read, maintain, enhance, and understand. In this, SmartArrays emulates APL.

Conclusion

APL was the first real array processing language and it began the Online Analytical Processing (OLAP) industry. APL2 extended this paradigm to more complicated collections of data. SmartArrays captures and extends the APL

APL-Journal 2006, Vol. 25, No. 1/2, p. 22
