14.06.2013 Views

Databases and Systems



to the structures in the integration view. For example, an integrated view for the non-human homolog search application could be one in which Loci22 and NA-Homolog-Summary were registered as structures (the latter being a function); another could be one in which the final query was registered under some appropriate name. Note that this is a rather trivial example; in general, the integrated views will contain many structures and be much more generic.

With the increasing popularity of CORBA as a standard for software and data sharing (http://www.omg.org/library/public-doclist.html), it is worth considering the relationship between BioKleisli and CORBA. They are similar in that both define a type system and a standard syntax/format/protocol for exchanging values that adhere to that type system. However, there are some important distinctions. BioKleisli is based around a query system and defines a rich language for collection types. Central to the system are rewrite rules and optimizations that improve the performance of queries and isolate the portions of a query that can be executed locally by the external data sources. The CORBA specification was not written with generic optimizations and rewrite rules in mind, although individual CORBA implementations may take their own steps to improve performance.
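The idea of isolating the part of a query that a data source can execute itself can be illustrated with a small sketch. This is not BioKleisli's optimizer: the plan node classes, the `FILTER_CAPABLE` set, and the source names are all hypothetical, and the single rule shown merely stands in for the much larger rule set a real system would use.

```python
from dataclasses import dataclass
from typing import Union

# All names below are hypothetical; they are not BioKleisli internals.

@dataclass(frozen=True)
class Scan:
    """Fetch a whole collection from an external source."""
    source: str
    collection: str

@dataclass(frozen=True)
class RemoteQuery:
    """Ask the source itself to apply a filter before returning data."""
    source: str
    collection: str
    condition: str

@dataclass(frozen=True)
class Select:
    """Filter the result of a child plan inside the integration engine."""
    child: "Plan"
    condition: str

Plan = Union[Scan, RemoteQuery, Select]

# Sources assumed, for illustration only, to be able to evaluate filters.
FILTER_CAPABLE = {"relational_db"}

def push_selection(plan: Plan) -> Plan:
    """Rewrite Select(Scan) into RemoteQuery when the source can filter."""
    if isinstance(plan, Select):
        child = push_selection(plan.child)
        if isinstance(child, Scan) and child.source in FILTER_CAPABLE:
            return RemoteQuery(child.source, child.collection, plan.condition)
        return Select(child, plan.condition)
    return plan
```

Under these assumptions, `push_selection(Select(Scan("relational_db", "loci"), "chromosome = 22"))` yields a `RemoteQuery`, so the filter travels to the database, whereas the same plan over a source not listed in `FILTER_CAPABLE` is left unchanged and filtered by the engine.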

However, CORBA is a powerful set of standards for interoperation that could be used from within BioKleisli as an external data source. That is, a CORBA driver could be written for BioKleisli, allowing BioKleisli to query any CORBA data source in addition to those it already supports. BioKleisli's type system is sufficiently rich to encode everything expressible in IDL. CORBA could also be used to provide a programmatic API to the BioKleisli system, allowing programmers to execute CPL queries from within a programming language of their choice. Finally, CORBA could be used "internally" as a replacement for the current mechanism by which the components of BioKleisli communicate, namely the execution engine and the data drivers, as is currently done in OPM.
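To make the claim about IDL more concrete, the following is a minimal sketch of how common IDL data constructs might be encoded, assuming a complex-value type system with base types, lists, records, and variants. The class names and the toy tuple format used to describe IDL types are hypothetical; they are neither BioKleisli nor CORBA APIs, and interfaces and object references are not covered.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical complex-value types (assumed, not taken from BioKleisli).

@dataclass(frozen=True)
class Base:
    name: str        # a primitive type name such as "string" or "long"

@dataclass(frozen=True)
class ListOf:
    element: "Type"  # element type of the list

@dataclass(frozen=True)
class Record:
    fields: tuple    # tuple of (label, Type) pairs

@dataclass(frozen=True)
class Variant:
    cases: tuple     # tuple of (label, Type) pairs

Type = Union[Base, ListOf, Record, Variant]

def encode(idl) -> Type:
    """Translate a toy IDL type description into the types above.

    A description is either a primitive name (str) or a (kind, body) pair,
    where kind is "sequence", "struct", "union", or "enum".
    """
    if isinstance(idl, str):
        return Base(idl)
    kind, body = idl
    if kind == "sequence":   # IDL sequence<T> becomes a list of T
        return ListOf(encode(body))
    if kind == "struct":     # IDL struct becomes a record
        return Record(tuple((label, encode(t)) for label, t in body))
    if kind == "union":      # IDL discriminated union becomes a variant
        return Variant(tuple((label, encode(t)) for label, t in body))
    if kind == "enum":       # IDL enum becomes a variant with data-free cases
        return Variant(tuple((label, Base("unit")) for label in body))
    raise ValueError(f"unsupported IDL construct: {kind}")
```

For instance, a struct with a `string` member and a `sequence<string>` member, written here as `("struct", [("symbol", "string"), ("homologs", ("sequence", "string"))])`, encodes to a record whose second field is a list of strings.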

OPM (6) is another integration toolkit that is popular within the bioinformatics community. The primary difference between OPM and BioKleisli is the goal from which each project started. OPM focused on using a simple object model to present good visual interfaces through which the user could understand and query the underlying system(s); its original application was to retrofit relational databases with a more intuitive object model. BioKleisli focused on finding a complete language for complex types, together with rewrite rules for optimization. The type system and language underlying OPM are therefore not as rich as those underlying BioKleisli, nor is it as easy to add new data sources to OPM as it is to BioKleisli; however, the visual interface to OPM is much richer than that currently available in BioKleisli. An ideal system would combine the interfaces used within OPM with the query engine of BioKleisli; work in this area is underway.
