
Databases and Systems


However, for the arbitrary N-ary fact, the objects are linked to each other conceptually like the nodes of a semantic network. The Qualifiers act somewhat like the edges in the network, but they are not fully satisfactory for this purpose given the current structure. Therefore, it is often hard to reconstruct the semantics of an N-ary fact given the data in the Associations table alone. (Even in cases where it is not hard, it generally requires several computational steps.) To avoid this (generally unneeded) computation, the "Narrative" field in the Facts table stores an explicit textual description of the N-ary fact for the user's perusal.

By analogy with Information-Retrieval methods, the Associations table may be regarded as an index (17) to the narrative text, for the purpose of rapid retrieval. The only difference between the Associations table and the inverted files created by free-text indexing engines (e.g., for Web-searchable document collections) is that the index-term vocabulary is more controlled with the Associations table. The similarity, however, is that, in both cases, complex Boolean retrieval (e.g., list all neurons where Dopamine has an inhibitory role) requires set operations, such as Union, Intersection, and Difference, on subsets (projections/selections) of the Associations table with each other. Relational set operations are computationally less efficient than the equivalent AND, OR and NOT operations that would have been needed with, say, a classical Binary-Relationship table, but the advantages are flexibility and a simple structure. (For example, multiple object instances on a single axis do not need to be managed through separate many-to-one related tables.) Also, in practice a significant proportion of queries tend to be based on a single axis rather than multiple ones. Such queries can be answered by locating the fact IDs corresponding to a particular class instance, and then simply returning the narrative for those IDs.
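The retrieval pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column layout (fact ID, object class, instance, qualifier), the sample rows, and the helper names `fact_ids`, `narratives`, and `neurons_in` are all hypothetical and do not come from the actual schema. Each call to `fact_ids` plays the role of a selection followed by a projection onto the fact ID, and Python set operators stand in for relational Union, Intersection, and Difference.

```python
# Hypothetical miniature of the Facts and Associations tables.
# Column names and sample rows are illustrative assumptions only.
FACTS = {
    1: "Dopamine inhibits neuron A via the D2 receptor.",
    2: "Dopamine excites neuron B via the D1 receptor.",
    3: "Serotonin inhibits neuron A.",
}

# Each row: (fact_id, object_class, instance, qualifier)
ASSOCIATIONS = [
    (1, "Neurotransmitter", "Dopamine", "inhibitory"),
    (1, "Neuron", "Neuron A", None),
    (1, "Receptor", "D2", None),
    (2, "Neurotransmitter", "Dopamine", "excitatory"),
    (2, "Neuron", "Neuron B", None),
    (2, "Receptor", "D1", None),
    (3, "Neurotransmitter", "Serotonin", "inhibitory"),
    (3, "Neuron", "Neuron A", None),
]


def fact_ids(object_class=None, instance=None, qualifier=None):
    """Select rows matching the given criteria, then project onto fact_id."""
    return {
        fid
        for fid, cls, inst, qual in ASSOCIATIONS
        if (object_class is None or cls == object_class)
        and (instance is None or inst == instance)
        and (qualifier is None or qual == qualifier)
    }


def narratives(ids):
    """Return the stored narrative text for each fact ID, in ID order."""
    return [FACTS[fid] for fid in sorted(ids)]


def neurons_in(ids):
    """List the Neuron instances that take part in the given facts."""
    return sorted(
        {inst for fid, cls, inst, _ in ASSOCIATIONS
         if fid in ids and cls == "Neuron"}
    )


# Single-axis query: locate the fact IDs for one instance, return narratives.
dopamine_facts = fact_ids(instance="Dopamine")

# Multi-axis Boolean query: intersect two subsets of the table.
inhibitory_dopamine = (
    fact_ids(instance="Dopamine", qualifier="inhibitory")
    & fact_ids(object_class="Neuron")
)

# Difference: facts that mention Neuron A but not Dopamine.
neuron_a_without_dopamine = (
    fact_ids(instance="Neuron A") - fact_ids(instance="Dopamine")
)
```

In this sketch the single-axis query needs only one lookup followed by `narratives(dopamine_facts)`, whereas the multi-axis query needs one lookup per axis plus a set operation, which mirrors the cost difference noted in the text.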

Managing Hierarchical Associations

While the Associations table can manage arbitrary N-ary data, that does not mean that every association in the database must be stored this way. Use of the Associations table should be restricted to representing highly heterogeneous facts (where both the number and the nature of the axes vary greatly). Facts best managed in the orthodox fashion include parent-child relationships, a special category of binary relationship. These are seen quite commonly in the NS, for example, with receptors (which have subtypes), and anatomical structures (which have substructures).

We have mentioned earlier the need to preprocess queries accessing hierarchical data. With such preprocessing, a query specified at a coarser level of granularity must also be able to retrieve facts stored in the database at a finer level of granularity, without having to store facts redundantly at multiple granularity levels. Standard transitive-closure algorithms for this purpose have been well-researched for the "Bill of Materials Problem" (18). Limited transitive-closure support will be provided in SQL-3 (19).
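As a rough illustration of the query preprocessing described above, the following Python sketch computes the transitive closure of a parent-child table with a breadth-first traversal and uses it to expand a coarse query term into all of its finer-grained descendants. The table contents, the `FACT_SITES` mapping, and the function names `descendants` and `facts_for` are hypothetical; this is one simple way to compute the closure, not necessarily the algorithm used in the cited work.

```python
from collections import defaultdict, deque

# Hypothetical parent-child table for anatomical structures
# (sample rows are illustrative only).
PARENT_CHILD = [
    ("Basal ganglia", "Striatum"),
    ("Basal ganglia", "Globus pallidus"),
    ("Striatum", "Caudate nucleus"),
    ("Striatum", "Putamen"),
]

# Hypothetical facts, each stored once at its finest granularity:
# fact_id -> anatomical structure.
FACT_SITES = {
    10: "Putamen",
    11: "Globus pallidus",
    12: "Hippocampus",
}


def descendants(root, edges):
    """Return root plus every structure reachable from it via parent-child edges."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:  # guards against repeated or cyclic entries
                seen.add(child)
                queue.append(child)
    return seen


def facts_for(structure):
    """Expand the query term to its descendants, then match stored facts."""
    terms = descendants(structure, PARENT_CHILD)
    return {fid for fid, site in FACT_SITES.items() if site in terms}
```

With this expansion step, a query on "Striatum" also matches the fact stored under "Putamen", so that fact does not have to be duplicated at the coarser level.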
