28.02.2014 Views

An Integrated Data Analysis Suite and Programming ... - TOBIAS-lib



3.2. A MODULAR SIGNAL-SLOT PROCESSING FRAMEWORK

further defines a method dump, which causes all data provided by the incorporated Reader to be passed to the data signal before triggering the flush signal to finalize processing (line 13).

Where application to entire data sets is not the intended use, modules must provide a means of suspending the flow of processing. Classes implementing the Reader concept may, for example, offer configurable standard filtering facilities composed of multiple cascaded processing modules. However, while the Reader concept implies that each data set element is retrievable individually, Pipe elements may produce an arbitrary amount of output in response to the processing of input data, which would render such modules unsuitable for this type of reuse. We therefore introduce the additional module connections freeze and thaw, which suspend and resume the data flow of a processing pipeline, respectively (figure 3.2). These connections run anti-parallel to the data and flush signals. As signal invocation corresponds to a series of nested function calls transmitted along the connected modules, source and pipe templates are able to directly detect suspend and resume requests sent by downstream elements. A provided class template extractor utilizes the implemented suspend facilities to act as a bridge between the Sink and Reader concepts. Thus, extractor objects may be inserted at the end of a line of processing modules to retrieve generated output data individually. A complementary template feed provides bridging between the Writer and Source concepts.
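The interplay of nested signal calls with freeze and thaw can be illustrated by a minimal sketch. It is not the framework's actual code: the class names vector_source and single_extractor, the helper pull_all, and the use of std::function as the signal mechanism are assumptions made for illustration, and no intermediate Pipe element is modelled. The source checks a flag each time a nested data call returns, which is how a freeze request issued by a downstream element becomes visible upstream.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Source wrapping a Reader (here: a vector).
template <typename T>
class vector_source {
public:
    explicit vector_source(std::vector<T> items) : items_(std::move(items)) {}

    std::function<void(const T&)> data;  // downstream-directed signals
    std::function<void()> flush;

    // Upstream-directed requests, issued by downstream modules.
    void freeze() { frozen_ = true; }
    void thaw() { frozen_ = false; dump(); }

    // Push elements downstream until exhausted or a freeze is requested.
    void dump() {
        assert(data && flush);  // both signals must be connected
        while (!frozen_ && pos_ < items_.size())
            data(items_[pos_++]);  // nested call; may invoke freeze()
        if (!frozen_ && !flushed_ && pos_ == items_.size()) {
            flushed_ = true;
            flush();
        }
    }

private:
    std::vector<T> items_;
    std::size_t pos_ = 0;
    bool frozen_ = false;
    bool flushed_ = false;
};

// Hypothetical bridge from the push side to pull-style retrieval.
// Requires T to be default-constructible and copy-assignable.
template <typename T>
class single_extractor {
public:
    explicit single_extractor(vector_source<T>& up) : up_(up) {
        up_.freeze();  // nothing flows until the first next()
        up_.data = [this](const T& v) { slot_ = v; filled_ = true; up_.freeze(); };
        up_.flush = [this] { done_ = true; };
    }
    single_extractor(const single_extractor&) = delete;

    // Retrieves one element; returns false once upstream has flushed.
    bool next(T& out) {
        filled_ = false;
        if (!done_) up_.thaw();
        if (!filled_) return false;
        out = slot_;
        return true;
    }

private:
    vector_source<T>& up_;
    T slot_{};
    bool filled_ = false;
    bool done_ = false;
};

// Usage: pull every element through the extractor, one at a time.
std::vector<int> pull_all(std::vector<int> input) {
    vector_source<int> src(std::move(input));
    single_extractor<int> ex(src);
    std::vector<int> out;
    int v = 0;
    while (ex.next(v)) out.push_back(v);
    return out;
}
```

In this sketch each call to next thaws the source, receives exactly one element through the nested data call, and freezes the source again before that call returns.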

3.2.3 Module Implementation

Encapsulation of subproblems as modules implementing both the Writer and Reader concepts comes with the disadvantage that all data generated in response to a single invocation of one of the Writer methods must be kept in an internal buffer for subsequent retrieval via the Reader interface. However, in the context of a processing pipeline each element of output data can in principle be transmitted directly to downstream modules for further processing, often avoiding the need to implement internal buffering. Such propagation can be achieved by implementing the Pipe concept directly, which exposes access to the module's data and flush signals. The implementation logic of unbuffered, interruptible processing is however more complex than the buffered equivalent. We therefore provide a base class template pipe_facade enforcing a restricted programming model to support correct implementation of such modules.

The facade template requires distribution of the processing implementation across four functions prepare, append, next and flush. A function emit is further made accessible to subclasses to propagate generated output. The template enforces a single-emit rule by which invocation of the emit base class function is permissible exactly once during each call to any of the four implementation functions. The purpose of each of the four functions is illustrated by example (listing 3.3).
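How a base class might police the single-emit rule can be sketched as follows. This is not pipe_facade itself: the names facade_sketch, put, finish, twice and run_twice are hypothetical, and the calling protocol (prepare followed by append for each input, with next drained after every call until it returns false) is an assumption. The sketch also enforces only the upper bound of the rule, rejecting a second emit within one call.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical CRTP base; only prepare/append/next/flush/emit are names
// taken from the text, everything else is illustrative.
template <typename Derived, typename In, typename Out>
class facade_sketch {
public:
    std::function<void(const Out&)> data;  // downstream data slot

    // Each implementation function runs under a fresh emit budget;
    // pending output is then drained through next().
    void put(const In& v) {
        guarded([&] { self().prepare(v); });
        drain();
        guarded([&] { self().append(v); });
        drain();
    }
    void finish() {
        guarded([&] { self().flush(); });
        drain();
    }

protected:
    // Rejects a second emit within one implementation-function call.
    void emit(const Out& v) {
        if (emitted_) throw std::logic_error("single-emit rule violated");
        emitted_ = true;
        if (data) data(v);
    }

private:
    Derived& self() { return static_cast<Derived&>(*this); }

    template <typename F>
    void guarded(F f) { emitted_ = false; f(); }

    // Assumption: next() returns true while it has emitted further output.
    void drain() {
        bool more = true;
        while (more) guarded([&] { more = self().next(); });
    }

    bool emitted_ = false;
};

// Toy module forwarding every input twice: append() emits the first copy,
// next() the second, so no single call emits more than once.
class twice : public facade_sketch<twice, int, int> {
public:
    void prepare(const int&) {}
    void append(const int& v) { pending_ = v; has_pending_ = true; emit(v); }
    bool next() {
        if (!has_pending_) return false;
        has_pending_ = false;
        emit(pending_);
        return true;
    }
    void flush() {}

private:
    int pending_ = 0;
    bool has_pending_ = false;
};

// Usage: collect everything the toy module sends downstream.
std::vector<int> run_twice(const std::vector<int>& in) {
    twice m;
    std::vector<int> out;
    m.data = [&out](const int& v) { out.push_back(v); };
    for (int v : in) m.put(v);
    m.finish();
    return out;
}
```

Splitting output across append and next is what allows a module to produce several elements per input while each individual call still emits only once.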

Raw sequencing read input data will typically be ordered by an identifier specific to the respective insert, whereas the order of the individual reads retrieved from the insert is arbitrary. For some purposes, further ordering of the sequencing reads of the same insert by their paired-end sequencing index may however be convenient. Implementation of an appropriate reordering module demonstrates the purpose of each of the four facade functions.
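Independent of the facade, the reordering itself can be sketched as a batch operation over an in-memory vector. The record type read_record, its field names, and the function order_within_inserts are assumptions for illustration; the module described in the text performs the equivalent work incrementally on a stream of reads.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record type; field names are assumptions.
struct read_record {
    std::string insert_id;  // identifier shared by all reads of one insert
    int pair_index;         // paired-end sequencing index, e.g. 1 or 2
};

// Sorts each run of consecutive reads sharing an insert_id by pair_index.
// Relies on the input already being grouped by insert identifier.
void order_within_inserts(std::vector<read_record>& reads) {
    auto first = reads.begin();
    while (first != reads.end()) {
        // Locate the end of the current insert's run of reads.
        auto last = first;
        while (last != reads.end() && last->insert_id == first->insert_id)
            ++last;
        std::stable_sort(first, last,
                         [](const read_record& a, const read_record& b) {
                             return a.pair_index < b.pair_index;
                         });
        first = last;
    }
}
```

The order of inserts relative to each other is left unchanged; only reads within the same insert are rearranged.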

The reordering module collects all reads belonging to the same insert in an internal buffer, and subsequently orders them by the value of their paired-end index. The prepare method serves as an initial notification about the next element that will be passed to the object's append method. In the case of the example module, identifiers are compared to determine whether further reads are available for the insert currently being processed. If the next read does not belong to the current insert, reordering of the reads by paired-end index is triggered. Reads are then transmitted for downstream processing by invocation of emit on the first of the collected reads. In concert with
