We will use as a running example a hypothetical laboratory project whose purpose
is to sequence and analyze a large number of cDNA clones drawn from several
libraries. A database for such a project would store information about (i) the libraries
being sequenced, (ii) the clones picked for sequencing, (iii) sequence-reads
performed on those clones, (iv) assemblies of sequence-reads (to coalesce multiple
reads from the same clone, and to detect and exploit situations in which duplicate
clones are picked), and (v) analyses of assembled sequences. The system would
comprise many components, including (i) software to control robots involved in
clone-picking and preparation of sequencing templates, (ii) base calling software, (iii)
software to strip vector and to screen raw sequences for quality, e.g., to detect E.
coli contamination and check for repetitive elements, (iv) sequence assembly
software, (v) sequence analysis software, and (vi) some means for laboratory
personnel to review the results.
LabBase and LabFlow are research software and are incomplete in many ways.
We will endeavor to point out the major holes that we are aware of. The software is
freely available and redistributable (see http://goodman.jax.org for details).
LabBase Data Management System
LabBase provides four main concepts for modeling laboratory (or other) databases:
Objects, Materials, Steps, and States. Objects are structural objects, similar to those
found in ACEDB [12], OPM [13], Lore [14], UnQL [15], and many other systems.
Materials are Objects that represent the identifiable things that participate in a
laboratory protocol, such as libraries and clones. Steps are Objects reporting the
results of a laboratory or analytical procedure, such as sequencing a clone, or running
BLAST [16] on a sequence. States are Objects that represent places in a laboratory
protocol, e.g., “ready for sequencing” or “ready for BLAST analysis”. We use the
term object (lower-case) to refer to any kind of Object including a Material, Step, or
State.
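
The four concepts can be pictured as a small type hierarchy. The following Python sketch is illustrative only: the class names, the `oid` and `attrs` fields, and the sample values are assumptions made for this example and are not LabBase's actual schema or interface.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class LabObject:
    """A structural object: an identifier plus attribute values (hypothetical)."""
    oid: str
    attrs: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Material(LabObject):
    """An identifiable thing in the protocol, e.g. a library or a clone."""


@dataclass
class Step(LabObject):
    """The reported result of a laboratory or analytical procedure."""


@dataclass
class State(LabObject):
    """A place in the protocol, e.g. 'ready for sequencing'."""


# Hypothetical sample data for the cDNA sequencing example.
clone = Material("clone-001", {"library": "lib-A"})
seq_step = Step("step-001", {"procedure": "sequencing", "sequence": "ACGT"})
ready = State("ready for sequencing")

# Every Material, Step, and State is also an object in the lower-case sense.
all_objects = [clone, seq_step, ready]
```

Modeling Material, Step, and State as subclasses of one base class mirrors the statement that all three are Objects; how LabBase itself represents them is not specified here.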
The most compelling feature of LabBase is that it provides built-in support for
two relationships among Materials, Steps, and States that lie at the core of typical
laboratory databases. One is a relationship connecting Steps to the Materials upon
which they operate. When a Step is stored in the database, LabBase automatically
links the Step to its operand Materials in a chronological history and provides a
means to access Step-data directly from these Materials; for example, one can
retrieve a clone’s sequence or a sequence’s BLAST analysis by querying the
respective Materials rather than the Steps. The second built-in relationship connects
Materials to States. When a Material is created, LabBase provides a means to place
the Material in an initial State; then as Steps operating on the Material are created,
the system provides a means to move the Material to the appropriate next State,
thereby tracking its progress through the protocol. Both of these relationships are
many-to-many. We discuss these relationships further when we describe LabBase
operations.
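
To make the two built-in relationships concrete, here is a toy in-memory sketch in Python. It is not LabBase's interface: the class `TinyLabStore`, its method names, and the rule that a query returns the most recent value reported by any Step are all assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional, Set, Tuple


class TinyLabStore:
    """Toy store for the Step-Material and Material-State relationships."""

    def __init__(self) -> None:
        # Material id -> chronological list of (step name, step results).
        self.history: Dict[str, List[Tuple[str, Dict[str, Any]]]] = defaultdict(list)
        # Material id -> set of State names; a set keeps this many-to-many.
        self.states: Dict[str, Set[str]] = defaultdict(set)

    def create_material(self, material: str, initial_state: Optional[str] = None) -> None:
        """Register a Material and optionally place it in an initial State."""
        self.history.setdefault(material, [])
        if initial_state is not None:
            self.states[material].add(initial_state)

    def put_step(
        self,
        name: str,
        operands: List[str],
        results: Dict[str, Any],
        moves: Optional[Dict[str, Tuple[str, str]]] = None,
    ) -> None:
        """Store a Step: link it to every operand Material's history, then
        apply the requested State moves (material -> (old state, new state))."""
        for material in operands:
            self.history[material].append((name, dict(results)))
        for material, (old, new) in (moves or {}).items():
            self.states[material].discard(old)
            self.states[material].add(new)

    def get(self, material: str, attr: str) -> Any:
        """Query Step data through the Material (assumed: latest value wins)."""
        for _name, results in reversed(self.history[material]):
            if attr in results:
                return results[attr]
        return None

    def materials_in(self, state: str) -> List[str]:
        """List the Materials currently in the given State."""
        return sorted(m for m, s in self.states.items() if state in s)


store = TinyLabStore()
store.create_material("clone-001", initial_state="ready for sequencing")
store.put_step(
    "sequence",
    operands=["clone-001"],
    results={"sequence": "ACGTTGCA"},
    moves={"clone-001": ("ready for sequencing", "ready for BLAST analysis")},
)
clone_sequence = store.get("clone-001", "sequence")
blast_queue = store.materials_in("ready for BLAST analysis")
```

In this sketch the clone's sequence is read from the Material rather than from the Step that produced it, and the State move is recorded in the same call that stores the Step, which is one simple way to keep a Material's history and its position in the protocol consistent.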