14.06.2013 Views

Databases and Systems



which the parser only parses the production(s) that it is asked to parse. The example can be modified so that, if you asked for the token “entry”, the parser would only go through the productions it needs, i.e. “databank” and “entry”, and would not parse “id_line”, “de_line”, etc.
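The demand-driven behaviour described here can be illustrated with a minimal Python sketch. It is not Icarus code: the `LazyParser` class, its `rule` and `get` methods, and the two-entry sample text are all hypothetical, chosen only to show that asking for “entry” triggers “databank” and “entry” and leaves “id_line” and “de_line” untouched.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class LazyParser:
    """Hypothetical parser that evaluates a production only on request."""

    def __init__(self) -> None:
        self._rules: Dict[str, Tuple[Tuple[str, ...], Callable]] = {}
        self._cache: Dict[str, object] = {}
        self.parsed_order: List[str] = []  # records which productions ran

    def rule(self, name: str, depends_on: Sequence[str], action: Callable) -> None:
        """Register a production, the productions it needs, and its action."""
        self._rules[name] = (tuple(depends_on), action)

    def get(self, name: str) -> object:
        """Parse `name`, first parsing only the productions it depends on."""
        if name in self._cache:
            return self._cache[name]
        depends_on, action = self._rules[name]
        inputs = [self.get(dep) for dep in depends_on]
        value = action(*inputs)
        self._cache[name] = value
        self.parsed_order.append(name)
        return value


# Made-up flat-file text with two entries, each terminated by "//".
TEXT = "ID   ABC1\nDE   first protein\n//\nID   XYZ2\nDE   second protein\n//\n"

parser = LazyParser()
parser.rule("databank", [], lambda: [e.strip() for e in TEXT.split("//") if e.strip()])
parser.rule("entry", ["databank"], lambda bank: bank[0].splitlines())
parser.rule("id_line", ["entry"], lambda lines: next(l for l in lines if l.startswith("ID")))
parser.rule("de_line", ["entry"], lambda lines: next(l for l in lines if l.startswith("DE")))

parser.get("entry")
print(parser.parsed_order)  # ['databank', 'entry']
```

Caching each result means that a later request for “id_line” reuses the already parsed “entry” instead of parsing it again.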

SRS is a flexible environment in which databank structures can be fully described, and thus in which databanks can easily be added or changed. Nevertheless, in a system with many databanks, many indices (at least one per data-field of each databank) have to be created and updated. The link indices add substantially to the complexity of maintaining an SRS system, since they manifest interdependencies between separate databanks. Fortunately, the index building process is automated. A program runs automatically at frequent intervals to check whether a new version of a databank exists and, if so, performs the appropriate actions. The storage size for indices is relatively small, about 10 to 20% of the size of the actual indexed databanks.
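As a rough illustration of such a periodic check, the following Python sketch compares file modification times to decide whether an index is stale, and turns the 10 to 20% figure into a storage estimate. The function names and the one-index-file-per-databank layout are assumptions made for this example; the actual SRS update program is not described in that detail here.

```python
import os
from typing import Tuple


def needs_reindex(databank_path: str, index_path: str) -> bool:
    """Return True if the index is missing or older than the databank file."""
    if not os.path.exists(index_path):
        return True
    return os.path.getmtime(databank_path) > os.path.getmtime(index_path)


def estimated_index_size(databank_bytes: int) -> Tuple[int, int]:
    """Lower and upper estimate of index storage (10% and 20% of the databank)."""
    return databank_bytes // 10, databank_bytes // 5
```

For a databank of 500,000,000 bytes, `estimated_index_size` returns a range of 50,000,000 to 100,000,000 bytes. A scheduler (for example a cron job) would call `needs_reindex` for every databank and rebuild only the stale indices.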

Querying and Linking

On top of Icarus, a set of programs and C-API (Application Programming Interface) functions allow the interrogation of the internal Icarus representation, which is superimposed on the real textual structure of databanks. Queries are expressed in the SRS Query Language, which has been especially designed for the interrogation of interrelated flat file databanks. SRS queries operate on sets of entries or subentries, belonging to one or more databanks. The sets are combined using logical operators AND (‘&’), OR (‘|’) and BUTNOT (‘!’), and also two ‘link-operators’ denoted by the symbols ‘<’ and ‘>’.
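The three logical operators map directly onto set operations. The sketch below uses Python sets of made-up entry identifiers to show the semantics; the `combine` helper and the sample sets are hypothetical, and the link operators are not modelled because they require the link indices between databanks.

```python
from typing import Callable, Dict, Set

# SRS operator symbol -> equivalent operation on two sets of entry identifiers.
OPERATORS: Dict[str, Callable[[Set[str], Set[str]], Set[str]]] = {
    "&": lambda a, b: a & b,  # AND: entries present in both sets
    "|": lambda a, b: a | b,  # OR: entries present in either set
    "!": lambda a, b: a - b,  # BUTNOT: entries in the first set but not the second
}


def combine(left: Set[str], op: str, right: Set[str]) -> Set[str]:
    """Combine two result sets with one of the logical operators."""
    return OPERATORS[op](left, right)


# Made-up result sets, as if returned by three separate index searches.
kinase = {"P001", "P002", "P003"}
human = {"P002", "P003", "P004"}
fragment = {"P003"}

human_kinases = combine(kinase, "&", human)
complete_human_kinases = combine(human_kinases, "!", fragment)
print(sorted(complete_human_kinases))  # ['P002']
```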

Operands include index searches of the format “[databank(s)-field:value]”, where one or more databanks can be listed. For querying purposes, the sub-entries (e.g. features of a sequence) of a databank are considered as a separate databank, and can be included in this list. Queries on subentry databanks result in sets of subentries. “Field” identifies the indexed field in which the search has to be performed. In the case of multiple databanks, the field must be defined for all of them. “Value” can be a string query with wild cards (‘*’ and ‘?’) or a regular expression (delimited by forward slashes ‘/’ at the beginning and end, and using ‘*’, ‘+’, ‘?’, (...), [...], etc.). Slightly different syntaxes allow numeric range or date queries. Here are some examples of queries:
