14.06.2013 Views

Databases and Systems



LabFlow Summary

LabFlow makes it easier to create a LIMS by providing facilities for orchestrating the execution of components. Workers encapsulate the vagaries of invoking components, capturing their outputs, and coping with failures. Steps handle database access. Routers control the order in which components execute. States keep track of the progress of Materials as they advance through the workflow. The LabFlow engine acts as a scheduler to carry out the work specified by the above elements.
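The division of labor among these elements can be illustrated with a minimal sketch. It is not LabFlow's actual interface, which this section does not show: the class names mirror the terms above, but every signature, the `run_engine` function, the state names, and the two placeholder components are assumptions made for illustration. A dictionary on each Material stands in for the database that a real Step would access.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Set


@dataclass
class Material:
    """A tracked sample; `state` records its progress through the workflow."""
    name: str
    state: str
    results: Dict[str, object] = field(default_factory=dict)


@dataclass
class Step:
    """Pairs a Worker with a Router and records the Worker's outputs.

    The Material's `results` dict stands in for the database a real
    Step would read and write (an assumption of this sketch).
    """
    worker: Callable[[Material], Dict[str, object]]  # invokes a component
    router: Callable[[Material, bool], str]          # chooses the next State


def run_engine(materials: Iterable[Material],
               steps: Dict[str, Step],
               final_states: Set[str]) -> None:
    """Schedule Steps until every Material reaches a final State."""
    queue = deque(materials)
    while queue:
        material = queue.popleft()
        if material.state in final_states:
            continue  # nothing left to do for this Material
        step = steps[material.state]
        try:
            outputs = step.worker(material)
            succeeded = True
        except Exception as exc:  # the Worker boundary absorbs component failures
            outputs = {"error": str(exc)}
            succeeded = False
        material.results.update(outputs)
        material.state = step.router(material, succeeded)
        queue.append(material)


# Placeholder components; a real Worker would invoke an external program.
def sequence(material: Material) -> Dict[str, object]:
    return {"sequenced": True}


def assemble(material: Material) -> Dict[str, object]:
    if not material.results.get("sequenced"):
        raise ValueError("no sequence read available")
    return {"assembled": True}


steps = {
    "needs_sequencing": Step(sequence, lambda m, ok: "needs_assembly" if ok else "failed"),
    "needs_assembly": Step(assemble, lambda m, ok: "done" if ok else "failed"),
}

good = Material("clone-1", "needs_sequencing")
bad = Material("clone-2", "needs_assembly")  # skipped sequencing, so assembly fails
run_engine([good, bad], steps, {"done", "failed"})
```

In this sketch the Router sees only whether the Worker succeeded, so failure handling stays out of the components themselves. The loop assumes every Router eventually returns a final State; a production scheduler would need a guard against workflows that never terminate.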

It would be useful to extend the system to handle workflows involving multiple Materials. Suppose we expand our example to include expression monitoring, in which cDNAs are arrayed on a chip and probed with different libraries. It is natural to think of this as a two-dimensional process whose materials are chips (which we would like to think of as groups of cDNAs) and libraries, and whose result is a two-dimensional matrix indicating expression level as a function of cDNA and library. In broad terms, the workflow for this process would include the following Steps: (1) prepare chip for probing; (2) prepare library for probing; (3) probe; (4) analyze the results. It seems intuitive to regard the chip as the Material for step (1), the library as the Material for step (2), and the pair as the Materials for steps (3) and (4). This complicates our previously simple view of Materials “flowing through” a workflow. We may need to adopt two perspectives: a sample-tracking view, in which we regard a workflow as describing how a Material travels from Step to Step through a process, and a task-completion view, in which we regard a workflow as describing the process needed to accomplish an arbitrary task. While more general, the latter is also more complex, and we have yet to work out all its implications.
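The shape of such a two-Material workflow can be sketched as follows. This is only an illustration of the four Steps named above, not a design the chapter proposes: the function names, the sample identifiers, and the constant `PLACEHOLDER_LEVEL` (which stands in for a real measurement) are all hypothetical.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical Materials: a chip is a group of cDNAs, a library is a name.
chips: Dict[str, List[str]] = {"chip-A": ["cdna-1", "cdna-2"]}
libraries: List[str] = ["lib-X", "lib-Y"]

PLACEHOLDER_LEVEL = 0.0  # stands in for a real measurement


def prepare_chip(chip_id: str) -> str:
    """Step 1: the Material is a single chip."""
    return chip_id


def prepare_library(library: str) -> str:
    """Step 2: the Material is a single library."""
    return library


def probe(chip_id: str, library: str) -> Dict[str, float]:
    """Step 3: the Material is a (chip, library) pair."""
    return {cdna: PLACEHOLDER_LEVEL for cdna in chips[chip_id]}


def analyze(raw: Dict[str, float]) -> Dict[str, float]:
    """Step 4: also operates on the pair; identity placeholder."""
    return dict(raw)


ready_chips = [prepare_chip(c) for c in chips]
ready_libraries = [prepare_library(lib) for lib in libraries]

# Result: expression level as a function of (cDNA, library).
expression: Dict[Tuple[str, str], float] = {}
for chip_id, library in product(ready_chips, ready_libraries):
    for cdna, level in analyze(probe(chip_id, library)).items():
        expression[(cdna, library)] = level
```

Steps (1) and (2) each take one argument while steps (3) and (4) take a pair, which is exactly where the single-Material, sample-tracking picture stops fitting.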

Another useful extension would be to handle workflows involving related Materials. We earlier mentioned three important kinds of inter-Material relationships, namely, derived-from, grouping, and part/whole. If LabBase and LabFlow were extended to represent these relationships in a direct manner, it would allow us to represent the example database and workflow more naturally. Instead of combining clones and sequence-reads into a single kind of Material, each could exist independently. And instead of coordinating multiple reads per clone in the sequence-assembly workflow, we could handle this through an explicit clone-analysis workflow. We see this as an important benefit, since the central theme of our work is to simplify LIMS construction by eliminating arcane design tasks and reducing what remains to its most natural form.
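One way to picture a direct representation of these relationships is as typed links between independent Materials. The sketch below is an assumption for illustration only; LabBase does not provide these structures according to this section, and the names `Relation`, `Link`, and `related` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Relation(Enum):
    """The three inter-Material relationships named in the text."""
    DERIVED_FROM = "derived-from"
    GROUPING = "grouping"
    PART_WHOLE = "part/whole"


@dataclass(frozen=True)
class Link:
    """A typed link: `source` stands in `relation` to `target`."""
    relation: Relation
    source: str
    target: str


def related(target: str, relation: Relation, links: List[Link]) -> List[str]:
    """Return the sources linked to `target` by `relation`, in input order."""
    return [l.source for l in links
            if l.target == target and l.relation is relation]


# Clones and sequence-reads exist as separate Materials; links connect them.
links = [
    Link(Relation.DERIVED_FROM, "read-1", "clone-1"),
    Link(Relation.DERIVED_FROM, "read-2", "clone-1"),
    Link(Relation.DERIVED_FROM, "read-3", "clone-2"),
]

reads_for_clone_1 = related("clone-1", Relation.DERIVED_FROM, links)
```

With links like these, a clone-analysis workflow could look up the reads belonging to a clone instead of carrying them inside a combined clone-and-read Material.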

Conclusion<br />

The LabBase and LabFlow systems described in this chapter are the latest in a series of systems we have built to tackle the problems of data management and workflow management for large-scale biological research projects. We and our colleagues have used the predecessor systems for several projects at the Whitehead/MIT Center for
