27.03.2014 Views

SEKE 2012 Proceedings - Knowledge Systems Institute


III. VALIDATING THE DIMENSION HEADERS

In [14], the dimensions, including their headers, are extracted according to visual clues. Nevertheless, our research found that the dimension headers are not always in the same category and hence should not be covered by the same label. Consider the example in Fig. 3. The dimension headers {Males, Females, Sex unknown, Aboriginals, Non-Aboriginals, Aboriginal identity unknown} share the same visual clues but do not belong to the same group, and no label can represent them as one collection.

To confirm that the dimension headers are related, we need to use the numeric values in the table data region. This is done by summing the numbers associated with the dimension headers under one of the columns. In our example, the sum of the numeric values under column “2004” associated with the previously mentioned dimension headers is 316. The sum exceeds the numeric value associated with the dimension summary header, which is 158. Thus, for our purpose, this shows that the group is not a collection, and it is not necessary to find a label for it. On the other hand, if the sum is less than the total under every column, as in the case of the table in Fig. 2, the group of dimension headers is acceptable, as the headers form only a proper subset. We call this the summarization rule.
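The summarization rule can be illustrated with a minimal sketch. The function name, the dictionary layout, and the individual cell values below are assumptions made for illustration and are not taken from Fig. 3; only the group sum (316) and the summary value (158) under column “2004” come from the text. The sketch also assumes that a sum equal to the total is acceptable, a case the text does not address.

```python
from typing import Dict, List

# Hypothetical layout: header -> {column name -> numeric value}
Rows = Dict[str, Dict[str, float]]


def satisfies_summarization_rule(
    rows: Rows, headers: List[str], totals: Dict[str, float]
) -> bool:
    """Return True if, under every column, the values of the given headers
    sum to no more than the value of the dimension summary header."""
    for column, total in totals.items():
        group_sum = sum(rows[header][column] for header in headers)
        if group_sum > total:
            # The headers overlap, so they cannot form one collection.
            return False
    return True


# Illustrative values only (NOT the real Fig. 3 data); chosen so that the
# six headers sum to 316 against a summary value of 158, as in the text.
rows: Rows = {
    "Males": {"2004": 120},
    "Females": {"2004": 35},
    "Sex unknown": {"2004": 3},
    "Aboriginals": {"2004": 40},
    "Non-Aboriginals": {"2004": 100},
    "Aboriginal identity unknown": {"2004": 18},
}
totals = {"2004": 158}

all_six = satisfies_summarization_rule(rows, list(rows), totals)
by_sex = satisfies_summarization_rule(
    rows, ["Males", "Females", "Sex unknown"], totals
)
```

With these made-up values, the six headers taken together fail the check because 316 exceeds 158, whereas the three sex-related headers alone pass it.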

Figure 3. Statistics table showing the summation problem

IV. FINDING isA RELATIONSHIPS - GENERAL SCHEME<br />

In this section and in Section V, the main problem of this research is tackled: given a set of headers that belongs to a dimension { , ..., }, a label needs to be found such that an isA relationship exists between the label and , 1
