Databases and Systems


methods of integrating maps in the erstwhile human Genome Database (GDB). Cooper et al describe their database of human mutations, an early entry in the increasingly important field of variation databases.

The article by Nadkarni et al provides a link to the developing field of neuroinformatics, which is concerned with databases of neuroscience data such as structural (MR or CT) or functional (PET, fMRI) images of the brain, histological slices, EEG and MEG data, cellular and network models, single cell recordings, and so on [15]. This article is included not only as a representative of neuroinformatic work, but also because it is one of the few current neuroinformatics efforts that links the molecular scale of bioinformatics to the neurophysiological scale, since it addresses the physiology of olfaction from the receptor sequences up to cellular and network physiology.

Eppig et al describe the Mouse Genome Database (MGD) and its companion system, the mouse Gene Expression Database (GXD). One of the key challenges for the next generation of databases is to begin to span the levels of organization between genotype and phenotype, where the processes of development and physiology reside. Baldock et al describe an anatomical atlas of the mouse suitable for representing spatiotemporal patterns of gene expression; the Edinburgh (Baldock et al) and Jackson Laboratory (Eppig et al) projects are collaborating to link the genetic and spatial databases together. The plant kingdom, which has recently experienced a rapid acceleration of genomic scrutiny in both the private and public sectors, is represented in articles on MaizeDB by Polacco and Coe and on the USDA’s Agricultural Genome Information System by Beckstrom-Sternberg and Jamison. Gelbart et al describe the rich integration of genomic and phenotypic data on Drosophila in FlyBase. Mary Berlyn describes the E. coli Genetic Stock Center Database, which provides query-by-genotype access to the stock center’s extensive collection of mutant strains.

The Software section contains a number of articles that address one or another aspect of the problem of integrating data from heterogeneous sources. There are two common ways to achieve such integration: federation, in which the data continue to reside in separate databases but a software layer makes them act as a single integrated collection, and physical integration, often called warehousing, in which the data are combined into a single repository for querying purposes. Both approaches involve transforming the data into a common format; federation does the transformation at query time, whereas warehousing does it as a preprocessing step.
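
The distinction between the two approaches can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two in-memory "source databases", their record layouts, the placeholder values, and the names `from_a`, `from_b`, `federated_query` and `Warehouse` are hypothetical and do not come from any of the systems described in this volume.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]  # the common format both approaches convert into

# Two hypothetical source databases with different native layouts.
# The values are placeholders, not real biological data.
source_a: List[dict] = [{"gene": "abc1", "chr": "1"}]
source_b: List[tuple] = [("xyz2", "2")]


def from_a(row: dict) -> Record:
    """Convert a source_a row into the common format."""
    return {"symbol": row["gene"], "chromosome": row["chr"]}


def from_b(row: tuple) -> Record:
    """Convert a source_b row into the common format."""
    return {"symbol": row[0], "chromosome": row[1]}


def federated_query(predicate: Callable[[Record], bool]) -> List[Record]:
    """Federation: the data stay in the sources and are converted on every query."""
    results = []
    for rows, convert in ((source_a, from_a), (source_b, from_b)):
        for row in rows:
            record = convert(row)
            if predicate(record):
                results.append(record)
    return results


class Warehouse:
    """Warehousing: the data are converted once and copied into one repository."""

    def __init__(self) -> None:
        self.records: List[Record] = []

    def load(self) -> None:
        """Preprocessing step: rebuild the repository from the current sources."""
        self.records = [from_a(r) for r in source_a] + [from_b(r) for r in source_b]

    def query(self, predicate: Callable[[Record], bool]) -> List[Record]:
        """Answer queries from the stored copy without touching the sources."""
        return [r for r in self.records if predicate(r)]
```

In this sketch, a call to `federated_query` always reads the sources as they are at that moment, while `Warehouse.query` returns whatever was present at the last call to `load`.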

One consequence of this difference is that warehouses are more difficult to keep current as the underlying databases are updated. The choice of federation vs. warehousing has performance implications as well, though they are not always easy to predict. A warehouse can map in a straightforward way to a DBMS product, and make full use of the tuning and optimization capabilities of that product. Federated systems must pay the price of translating queries at run-time, possibly doing unoptimized distributed joins of query fragments across multiple databases, and converting data into the standard form. It is also possible for federated systems to
