14.06.2013 Views

Databases and Systems



20 EBI: CORBA AND THE EBI DATABASES

Kim Jungfer, Graham Cameron and Tomas Flores

EMBL Outstation - Hinxton, The European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom

Introduction

The European Bioinformatics Institute (EBI) is a major center for biological data. Research groups have collected genome-related data for the last 15 years, during which the amount of data has grown exponentially. There are now more than 300 publicly available collections of highly interrelated data. As the size and complexity of molecular biology data grow, automatic tools for management, querying and analysis become more important. The current limitations in using this wealth of information are due not to missing technology but to a lack of standardization. Biologists utilize every possible hardware platform, operating system, database management system and programming language. The de facto standard CORBA [9] [10] [11] offers the opportunity to make such differences transparent and thereby helps to combine disparate data sources and application programs.

Molecular biology data have traditionally been stored in simple text files, often referred to as flat-files. Flat-files are the minimalist storage mechanism, adopted for small data sets and simple programs because they are easily distributed and readily comprehensible. Even large volumes of complex data, although managed in database management systems, are often distributed as flat-file "entries". This leads biologists to see the flat-file as the basic data representation. The advent of the World Wide Web [2] strengthened this view. Flat-files can easily be transformed to hypertext by turning references into hypertext links. The Sequence Retrieval System SRS [7] is a well-known example of this approach. Flat-files became the center of the data flow in molecular biology. Every data collection has to provide a flat-file version in order to distribute the data, and most analysis programs use flat-files as their data source.
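The idea of turning flat-file references into hypertext links can be illustrated with a minimal Python sketch. It is not how SRS works internally. It assumes a simplified entry layout in which cross-references appear on lines starting with `DR`, followed by a database name and an accession separated by semicolons. The URL template, the function name `entry_to_html` and the sample entry are hypothetical and chosen only for illustration.

```python
import html
import re

# Hypothetical URL template; real services use their own addressing schemes.
URL_TEMPLATE = "https://example.org/{db}/{acc}"

# Simplified cross-reference line, e.g. "DR   SWISS-PROT; P12345; NAME."
# (assumed layout: line code, database name, accession, optional extras).
DR_LINE = re.compile(r"^DR\s+(?P<db>[^;]+);\s*(?P<acc>[^;.\s]+)")


def entry_to_html(entry: str) -> str:
    """Render a flat-file entry as HTML, linking accessions on DR lines."""
    rendered = []
    for line in entry.splitlines():
        match = DR_LINE.match(line)
        if match is None:
            # Ordinary line: escape it and keep it unchanged.
            rendered.append(html.escape(line))
            continue
        db = match.group("db").strip().lower()
        acc = match.group("acc")
        url = URL_TEMPLATE.format(db=db, acc=acc)
        link = f'<a href="{html.escape(url)}">{html.escape(acc)}</a>'
        # Replace only the accession text with the link; escape the rest.
        before = html.escape(line[: match.start("acc")])
        after = html.escape(line[match.end("acc"):])
        rendered.append(before + link + after)
    # <pre> preserves the fixed-width layout of the flat-file.
    return "<pre>\n" + "\n".join(rendered) + "\n</pre>"


if __name__ == "__main__":
    sample = "ID   EXAMPLE1\nDR   SWISS-PROT; P12345; EXAMPLE_HUMAN."
    print(entry_to_html(sample))
```

The sketch keeps the entry text intact and only wraps the accession in a link, which reflects why the flat-file view carried over so easily to the Web: the hypertext version is the same entry with its references made clickable.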
