14.06.2013 Views

Databases and Systems

To create a database for a specific laboratory protocol, the main tasks are to give names to the Materials and Steps of interest, and to describe the data to be reported in each Step.

Steps are generally obvious, because they correspond to the actual work being done in the laboratory protocol. The main subtlety is ensuring that Steps correspond to useful points of contact between the laboratory and the computer. In our running example, possible Steps include ones reporting (i) that a library has been constructed, (ii) that a clone has been picked and plated, (iii) that sequence-template has been prepared from a clone, (iv) that sequence-template has been loaded onto a sequencing machine and run, (v) the results of a sequencing-run, e.g., base-calls, quality indicators, and chromatographs, (vi) the results of vector stripping and quality screening of sequencing results, (vii) the results of assembling sequences, and (viii) the results of analyzing sequence-assemblies.

Many Materials are equally obvious, because they correspond to the major reagents employed in the protocol, e.g., libraries and clones, or the major data produced by the protocol, e.g., sequence-reads and assemblies. As with Steps, the main danger is excess: Materials should only be defined for things that are really worth tracking. Limitations in our current software push strongly in the direction of parsimony. The mechanism mentioned above for connecting Step-data to Materials only works for Steps operating directly on a Material; it does not work transitively over related Materials. While it is easy to get the base-calls for a sequence-read, and a list of all sequence-reads for a given clone, and a list of all clones picked from a library, the software offers no special help for getting all base-calls for all sequence-reads for a given clone or library. A second limitation is that LabFlow (see later section) only supports workflows in which a single kind of Material marches through a protocol. The effect of these limitations is to encourage database designs in which multiple real-world materials are merged into a single database-Material. In our example, it would probably be best to represent libraries as Objects (not Materials), and to merge clones and sequence-reads into one Material; assemblies would probably remain as separate Materials. The end result is a database with just two kinds of Materials: sequence-reads and sequence-assemblies.
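The first limitation can be illustrated with a small sketch. It is not the software's actual interface: the dictionaries, function names, and sample identifiers below are all hypothetical, chosen only to show that each direct link is a single lookup while the transitive question (all base-calls for a clone or library) has to be answered by walking the links by hand.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the database; none of these
# names come from the software described in the text.
clones_by_library = defaultdict(list)  # library id -> clone ids
reads_by_clone = defaultdict(list)     # clone id -> sequence-read ids
base_calls_by_read = {}                # sequence-read id -> base-calls


def record_read(library, clone, read, base_calls):
    """Register one sequence-read and the direct links leading to it."""
    if clone not in clones_by_library[library]:
        clones_by_library[library].append(clone)
    reads_by_clone[clone].append(read)
    base_calls_by_read[read] = base_calls


def base_calls_for_clone(clone):
    """Manual transitive step: clone -> sequence-reads -> base-calls."""
    return {read: base_calls_by_read[read] for read in reads_by_clone[clone]}


def base_calls_for_library(library):
    """Manual transitive walk: library -> clones -> reads -> base-calls."""
    result = {}
    for clone in clones_by_library[library]:
        result.update(base_calls_for_clone(clone))
    return result


# Made-up sample data for illustration only.
record_read("lib1", "cloneA", "read1", "ACGT")
record_read("lib1", "cloneA", "read2", "ACGA")
record_read("lib1", "cloneB", "read3", "TTGA")
```

Every additional level of related Materials adds another hand-written loop of this kind, which is one reason a design with fewer Materials is easier to query.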

To recapitulate, the database for our running example would have just two kinds of Materials, sequence-reads and sequence-assemblies, and many kinds of Steps, each operating on one Material. One of the possible Steps listed earlier, namely, the one reporting on library construction, must fall by the wayside, since we have decided to represent libraries as Objects, not Materials; data on library construction would be stored as fields of these library Objects. The most obvious, practical shortcoming of this example database is that without a clone Material, we lose the most natural means of coordinating multiple reads from the same clone. In the database as given, one would probably coordinate multiple reads per clone in the context of sequence-assemblies; this may be workable but is certainly not ideal.
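The design recapped above can be sketched as plain Python classes. This is only an illustration under stated assumptions: the class names, field names, step names, and sample values are hypothetical and do not reflect the actual schema language. The sketch shows libraries as ordinary Objects carrying their construction data as fields, a single Material that merges clone and sequence-read, and the workaround of grouping reads by clone when an assembly is prepared.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Library:
    """Plain Object: construction data lives in fields, not in a Step."""
    name: str
    constructed_on: str  # hypothetical field
    vector: str          # hypothetical field


@dataclass
class SequenceRead:
    """Material that merges the clone and the sequence-read."""
    read_id: str
    clone_name: str      # the clone survives only as an attribute
    library: Library
    steps: list = field(default_factory=list)  # (step name, reported data)

    def report(self, step_name, **data):
        """Record that a Step was performed on this Material."""
        self.steps.append((step_name, data))


@dataclass
class SequenceAssembly:
    """The second kind of Material."""
    assembly_id: str
    read_ids: list = field(default_factory=list)
    steps: list = field(default_factory=list)


def group_reads_by_clone(reads):
    """Coordinate multiple reads per clone without a clone Material."""
    groups = defaultdict(list)
    for read in reads:
        groups[read.clone_name].append(read.read_id)
    return dict(groups)


# Made-up sample data for illustration only.
lib = Library(name="lib1", constructed_on="unknown", vector="example-vector")
reads = [
    SequenceRead("read1", "cloneA", lib),
    SequenceRead("read2", "cloneA", lib),
    SequenceRead("read3", "cloneB", lib),
]
reads[0].report("pick_and_plate", plate="P1")
groups = group_reads_by_clone(reads)
assembly = SequenceAssembly("asm1", read_ids=groups["cloneA"])
```

Because the clone exists only as an attribute of each read, nothing in the structure itself keeps reads of the same clone together; the grouping has to be recomputed wherever it is needed, which is the shortcoming noted above.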
