
Databases and Systems


We will use as a running example a hypothetical laboratory project whose purpose is to sequence and analyze a large number of cDNA clones drawn from several libraries. A database for such a project would store information about (i) the libraries being sequenced, (ii) the clones picked for sequencing, (iii) sequence-reads performed on those clones, (iv) assemblies of sequence-reads (to coalesce multiple reads from the same clone, and to detect and exploit situations in which duplicate clones are picked), and (v) analyses of assembled sequences. The system would comprise many components, including (i) software to control robots involved in clone-picking and preparation of sequencing templates, (ii) base calling software, (iii) software for stripping vector and quality-screening raw sequences, e.g., to detect E. coli contamination and check for repetitive elements, (iv) sequence assembly software, (v) sequence analysis software, and (vi) some means for laboratory personnel to review the results.

LabBase and LabFlow are research software and are incomplete in many ways. We will endeavor to point out the major holes that we are aware of. The software is freely available and redistributable (see http://goodman.jax.org for details).

LabBase Data Management System

LabBase provides four main concepts for modeling laboratory (or other) databases: Objects, Materials, Steps, and States. Objects are structural objects, similar to those found in ACEDB [12], OPM [13], lore [14], UnQL [15], and many other systems. Materials are Objects that represent the identifiable things that participate in a laboratory protocol, such as libraries and clones. Steps are Objects reporting the results of a laboratory or analytical procedure, such as sequencing a clone, or running BLAST [16] on a sequence. States are Objects that represent places in a laboratory protocol, e.g., “ready for sequencing” or “ready for BLAST analysis”. We use the term object (lower-case) to refer to any kind of Object including a Material, Step, or State.
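The four concepts can be pictured as a small type hierarchy. The following is only an illustrative sketch in Python, not LabBase's actual schema or interface: the class names `LabObject`, `Material`, `Step`, and `State`, the `attributes` dictionary, and the `operands` field are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class LabObject:
    """A structural object: a name plus a bag of attributes (assumed layout)."""
    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Material(LabObject):
    """An identifiable thing in the protocol, e.g. a library or a clone."""


@dataclass
class Step(LabObject):
    """The reported result of a laboratory or analytical procedure."""
    operands: List[Material] = field(default_factory=list)


@dataclass
class State(LabObject):
    """A place in the protocol, e.g. 'ready for sequencing'."""


# Hypothetical instances drawn from the running cDNA example.
clone = Material("clone-0001", {"library": "lib-A"})
sequencing = Step("sequence-read", {"read_length": 512}, operands=[clone])
ready = State("ready for BLAST analysis")

# All three are "objects" in the lower-case sense used in the text.
kinds = [type(o).__name__ for o in (clone, sequencing, ready)]
all_are_objects = all(isinstance(o, LabObject) for o in (clone, sequencing, ready))
```

Modeling Material, Step, and State as subclasses of one base type mirrors the statement that each of them is a kind of Object; how LabBase represents this internally is not specified here.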

The most compelling feature of LabBase is that it provides built-in support for two relationships among Materials, Steps, and States that lie at the core of typical laboratory databases. One is a relationship connecting Steps to the Materials upon which they operate. When a Step is stored in the database, LabBase automatically links the Step to its operand Materials in a chronological history and provides a means to access Step-data directly from these Materials; for example, one can retrieve a clone’s sequence or a sequence’s BLAST analysis by querying the respective Materials rather than the Steps. The second built-in relationship connects Materials to States. When a Material is created, LabBase provides a means to place the Material in an initial State; then as Steps operating on the Material are created, the system provides a means to move the Material to the appropriate next State thereby tracking its progress through the protocol. Both of these relationships are many-to-many. We discuss these relationships further when we describe LabBase operations.
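To make the two built-in relationships concrete, here is a minimal in-memory sketch in Python. It is not the LabBase API: the class `TinyLabStore`, its methods `create_material`, `put_step`, and `latest`, and the rule that the most recent Step supplies a value are all assumptions chosen for illustration.

```python
from collections import defaultdict


class TinyLabStore:
    """In-memory stand-in for the two built-in relationships (hypothetical)."""

    def __init__(self):
        self.history = defaultdict(list)  # material id -> Steps, oldest first
        self.states = defaultdict(set)    # material id -> current State names

    def create_material(self, material_id, initial_state):
        # Place a newly created Material in its initial State.
        self.states[material_id].add(initial_state)

    def put_step(self, kind, operands, data, transitions=None):
        """Store a Step, link it to each operand Material, apply State moves.

        transitions maps material id -> (old_state, new_state).
        """
        step = {"kind": kind, "operands": list(operands), "data": dict(data)}
        for material_id in operands:
            self.history[material_id].append(step)
        for material_id, (old, new) in (transitions or {}).items():
            self.states[material_id].discard(old)
            self.states[material_id].add(new)
        return step

    def latest(self, material_id, key):
        # Read Step data through the Material; assumed rule: newest Step wins.
        for step in reversed(self.history[material_id]):
            if key in step["data"]:
                return step["data"][key]
        return None


store = TinyLabStore()
store.create_material("clone-0001", "ready for sequencing")
store.put_step(
    "sequence-read",
    operands=["clone-0001"],
    data={"sequence": "ACGTACGT"},
    transitions={"clone-0001": ("ready for sequencing", "ready for BLAST analysis")},
)
sequence = store.latest("clone-0001", "sequence")
current = store.states["clone-0001"]
```

In this sketch a Step may list several operand Materials and a Material may hold several States at once, which is one simple way to reflect that both relationships are many-to-many.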
