14.06.2013 Views

Databases and Systems




This central role of flat-files has several disadvantages. The most obvious problem is that flat-files are difficult to use. Writing a parser is a non-trivial task, which is further complicated by imprecisely specified and frequently changing formats. Previous attempts to establish a standard flat-file format have failed because biologists do not agree on a single model of their data. Another major drawback is that flat-files can lead to an immense waste of computing resources. Different programs often expect different flat-file formats, so the same site needs to keep multiple copies of the same data (for example FASTA and BLAST). Finally, the desire for human readability leads to attempts to keep all information associated with an entry together. Since this conflicts with the goal of normalization, flat-files tend to be redundant. For instance, the classification of a virus might be repeated in all entries from that virus.
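The redundancy described above can be illustrated with a minimal sketch. The accessions, organism names, classification strings and field names are all hypothetical and do not come from any real database. The sketch only shows how a classification that is repeated in every flat-file entry can instead be stored once per organism.

```python
# Hypothetical flat-file style entries: each one repeats the full
# classification of the virus it came from.
flat_entries = [
    {"accession": "X0001", "organism": "Example virus A",
     "classification": "Viruses; ExampleFamily; ExampleGenus",
     "sequence": "ACGT"},
    {"accession": "X0002", "organism": "Example virus A",
     "classification": "Viruses; ExampleFamily; ExampleGenus",
     "sequence": "GGCA"},
    {"accession": "X0003", "organism": "Example virus B",
     "classification": "Viruses; OtherFamily; OtherGenus",
     "sequence": "TTAG"},
]


def normalize(entries):
    """Split entries into an organism table and an entry table.

    The classification is stored once per organism; each entry keeps only
    a reference (the organism name) instead of its own copy.
    """
    organisms = {}
    normalized_entries = []
    for entry in entries:
        organisms[entry["organism"]] = entry["classification"]
        normalized_entries.append({
            "accession": entry["accession"],
            "organism": entry["organism"],
            "sequence": entry["sequence"],
        })
    return organisms, normalized_entries


organisms, entries = normalize(flat_entries)
```

In this form a change to the classification of a virus touches one record rather than every entry derived from that virus, at the cost of entries that are no longer self-contained for a human reader.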

Strategy

It would not be realistic to abolish the use of flat-files in general, but it is necessary to allow for alternative solutions that overcome the problems discussed above. This can be achieved only if the dependency on flat-files and their formats is removed. The task for which flat-files are least suited, providing a programming interface to application programs, demonstrates this point. A sequence comparison program that depends on a flat-file format violates the principle of data independence (see for example [6]). In an ideal world it should make no difference whether an application accesses a local flat-file or a remote relational database, as long as both serve the same data. The concept of interfaces helps to achieve this goal. Data sources and application programs interact through interfaces, which hide the underlying implementation details. A piece of software that can be used in different contexts through a defined interface is called a component. But componentry can work only if used together with a widely accepted standard for defining interfaces and invoking methods through them. CORBA is such a standard.
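The idea of data independence through interfaces can be sketched in a few lines. The sketch uses plain Python, not CORBA. The interface `SequenceSource`, its method `get_sequence`, the two implementations, the table layout and the sample entries are assumptions made for illustration only. An in-memory SQLite database stands in for a remote relational database, and the flat-file format is a simplified FASTA-like layout.

```python
import io
import sqlite3
from abc import ABC, abstractmethod


class SequenceSource(ABC):
    """Hypothetical interface: the only thing an application may rely on."""

    @abstractmethod
    def get_sequence(self, accession: str) -> str:
        """Return the sequence stored under the given accession."""


class FlatFileSource(SequenceSource):
    """Reads a simplified FASTA-like flat-file ('>' header, sequence lines)."""

    def __init__(self, handle):
        self._sequences = {}
        accession = None
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # The first word after '>' is taken as the accession.
                accession = line[1:].split()[0]
                self._sequences[accession] = ""
            elif line and accession is not None:
                self._sequences[accession] += line

    def get_sequence(self, accession):
        return self._sequences[accession]


class RelationalSource(SequenceSource):
    """Reads the same data from a relational table (SQLite as a stand-in)."""

    def __init__(self, connection):
        self._connection = connection

    def get_sequence(self, accession):
        row = self._connection.execute(
            "SELECT sequence FROM entries WHERE accession = ?", (accession,)
        ).fetchone()
        if row is None:
            raise KeyError(accession)
        return row[0]


def count_matches(source: SequenceSource, first: str, second: str) -> int:
    """Toy sequence comparison: count positions where two sequences agree."""
    a, b = source.get_sequence(first), source.get_sequence(second)
    return sum(x == y for x, y in zip(a, b))


# The same two hypothetical entries, served by two implementations.
flat_source = FlatFileSource(
    io.StringIO(">X0001 demo\nACGT\nAC\n>X0002 demo\nACGA\nTC\n")
)

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE entries (accession TEXT PRIMARY KEY, sequence TEXT)"
)
connection.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [("X0001", "ACGTAC"), ("X0002", "ACGATC")],
)
relational_source = RelationalSource(connection)

flat_result = count_matches(flat_source, "X0001", "X0002")
relational_result = count_matches(relational_source, "X0001", "X0002")
```

The comparison function `count_matches` never sees a file format or a query. It depends only on the interface, so either data source can be swapped in without changing the application.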

CORBA

In 1989, the Object Management Group (OMG) was formed [19]. It now has more than 700 members, including practically all major software vendors, hardware vendors and large end-users. OMG’s stated goal is to standardize and promote object technology. The core specification adopted by the OMG is the Common Object Request Broker Architecture (CORBA); references [12] and [14] give good introductions. CORBA combines the concept of interfaces with the object-oriented distributed programming paradigm. Among other things it specifies:

the Interface Definition Language (IDL), which provides a language-independent way of describing the public interface of objects.
