An Integrated Data Analysis Suite and Programming ... - TOBIAS-lib
3.2. A MODULAR SIGNAL-SLOT PROCESSING FRAMEWORK
further defines a method dump, which causes all data provided by the incorporated Reader to be
passed to the data signal before triggering the flush signal to finalize processing (line 13).
For cases where application to entire data sets is not the intended use, modules must provide
a means of suspending the flow of processing. Classes implementing the Reader concept may, for
example, offer configurable standard filtering facilities comprising multiple cascaded processing
modules. However, while the Reader concept implies that each data set element is retrievable
individually, Pipe elements may produce an arbitrary amount of output in response to input
data processing, which would render such modules unsuitable for this type of reuse. We therefore
introduce the additional module connections freeze and thaw, which serve to suspend and resume
processing pipeline data flow, respectively (figure 3.2). These connections run anti-parallel to
the data and flush signals. As signal invocation corresponds to a series of nested function
calls transmitted along the connected modules, source and pipe templates are able to directly
detect suspend and resume requests sent by downstream elements. A provided class template
extractor utilizes the implemented suspend facilities to act as a bridge between the Sink and
Reader concepts. Thus, extractor objects may be inserted at the end of a line of processing
modules to retrieve generated output data individually. A complementary template feed provides
bridging between the Writer and Source concepts.
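The interplay of the data connection with the anti-parallel freeze and thaw connections can be illustrated with a minimal sketch. The classes int_source and int_extractor and the helper drain below are hypothetical and heavily simplified; they are not the source, pipe or extractor templates of the framework. They only show, under these assumptions, how a downstream element can suspend an upstream loop from within the nested data call and thereby hand out elements one at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical source: pushes elements downstream through the data slot
// until a downstream module requests suspension via freeze().
class int_source {
public:
    explicit int_source(std::vector<int> values) : values_(std::move(values)) {}

    // Downstream connection (parallel to the data flow).
    std::function<void(int)> data;

    // Upstream connections (anti-parallel to the data flow).
    void freeze() { frozen_ = true; }
    void thaw() { frozen_ = false; }

    // Push elements until the input is exhausted or a freeze was requested.
    void run() {
        while (!frozen_ && pos_ < values_.size()) {
            data(values_[pos_++]);  // nested call; may invoke freeze()
        }
    }

private:
    std::vector<int> values_;
    std::size_t pos_ = 0;
    bool frozen_ = false;
};

// Hypothetical extractor: sits at the end of the chain and exposes the
// pushed elements one by one, in the manner of a Reader.
class int_extractor {
public:
    explicit int_extractor(int_source& src) : src_(src) {
        src_.data = [this](int v) {
            value_ = v;
            has_value_ = true;
            src_.freeze();  // suspend upstream as soon as one element arrived
        };
    }

    // Retrieve the next element; returns false once the source is exhausted.
    bool next(int& out) {
        has_value_ = false;
        src_.thaw();
        src_.run();
        if (!has_value_) return false;
        out = value_;
        return true;
    }

private:
    int_source& src_;
    int value_ = 0;
    bool has_value_ = false;
};

// Convenience helper: pull every element through the extractor.
std::vector<int> drain(std::vector<int> input) {
    int_source src(std::move(input));
    int_extractor ex(src);
    std::vector<int> out;
    int v = 0;
    while (ex.next(v)) out.push_back(v);
    return out;
}
```

Because freeze is called inside the nested data invocation, the source notices the request before pushing the next element, which is the property the text attributes to nested function calls along the connected modules.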
3.2.3 Module Implementation
Encapsulation of subproblems as modules implementing both the Writer and Reader concepts
comes with the disadvantage that all data generated in response to a single invocation of one of
the Writer methods must be kept in an internal buffer for subsequent retrieval via the Reader
interface. However, in the context of a processing pipeline, each element of output data can in
principle be transmitted directly to downstream modules for further processing, often avoiding
the need to implement internal buffering. Direct propagation can be achieved by implementing
the Pipe concept directly, which exposes the module's data and flush
signals. The implementation logic of unbuffered, interruptible processing is, however, more complex
than that of the buffered equivalent. We therefore provide a base class template pipe_facade enforcing
a restricted programming model to support correct implementation of such modules.
The facade template requires the processing implementation to be distributed across four
functions: prepare, append, next and flush. A function emit is further made accessible to subclasses
to propagate generated output. The template enforces a single-emit rule by which invocation
of the emit base class function is permissible exactly once during each call to any of the four
implementation functions. The purpose of each of the four functions is illustrated by example
(listing 3.3).
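The single-emit rule can be illustrated in isolation. The following is a minimal sketch, not the pipe_facade template: the class single_emit_base, its helper guarded and the example modules doubler and faulty are hypothetical names introduced here. The sketch also assumes that the rule is checked at run time and that a second emit within one implementation call is reported by an exception; how the actual template enforces the rule is not stated in the text.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical base class illustrating a run-time single-emit check.
template <typename Out>
class single_emit_base {
public:
    // Downstream data connection.
    std::function<void(const Out&)> data;

protected:
    // Propagate one output element. A second call within the same
    // implementation function violates the rule and throws.
    void emit(const Out& value) {
        if (emitted_) throw std::logic_error("emit called twice in one call");
        emitted_ = true;
        if (data) data(value);
    }

    // Run one implementation function with a fresh emit budget.
    // Returns true if the function emitted an element.
    template <typename F>
    bool guarded(F&& f) {
        emitted_ = false;
        f();
        return emitted_;
    }

private:
    bool emitted_ = false;
};

// Well-behaved example module: emits each appended value doubled.
class doubler : public single_emit_base<int> {
public:
    bool append(int v) {
        return guarded([&] { emit(2 * v); });
    }
};

// Faulty example module: attempts two emits during a single append call.
class faulty : public single_emit_base<int> {
public:
    bool append(int v) {
        return guarded([&] { emit(v); emit(v); });
    }
};
```

Restricting each call to a single emit means that control returns to the base class after every output element, which gives it a well-defined point at which a suspend request from downstream can take effect.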
Raw sequencing read input data will typically be ordered by an identifier specific to the
respective insert, whereas the order of the individual reads retrieved from the insert is arbitrary.
For some purposes, however, further ordering of the sequencing reads of the same insert by their
paired-end sequencing index may be convenient. Implementation of an appropriate reordering
module demonstrates the purpose of each of the four facade functions.
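The intended result of the reordering can be stated compactly with a fully buffered sketch. This is not the facade-based module of listing 3.3: the type seq_read, its fields and the function reorder_by_pair_index are hypothetical, and the whole input is assumed to fit in memory. Reads of the same insert are assumed to be adjacent in the input, as described above.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical read record: insert identifier plus paired-end index.
struct seq_read {
    std::string insert_id;
    int pair_index;
};

// Sort each run of consecutive reads sharing an insert identifier by
// paired-end index, leaving the order of the inserts themselves unchanged.
std::vector<seq_read> reorder_by_pair_index(std::vector<seq_read> reads) {
    auto group_begin = reads.begin();
    while (group_begin != reads.end()) {
        // Find the first read that belongs to a different insert.
        auto group_end = std::find_if(
            group_begin, reads.end(), [&](const seq_read& r) {
                return r.insert_id != group_begin->insert_id;
            });
        // Order the reads of the current insert by paired-end index.
        std::stable_sort(group_begin, group_end,
                         [](const seq_read& a, const seq_read& b) {
                             return a.pair_index < b.pair_index;
                         });
        group_begin = group_end;
    }
    return reads;
}
```

An unbuffered pipeline module has to produce the same ordering while holding only the reads of the current insert, which is where the division into the four facade functions comes in.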
The reordering module collects all reads belonging to the same insert in an internal buffer
and subsequently orders them by the value of their paired-end index. The prepare method serves as
an initial notification about the next element that will be passed to the object's append method. In
the case of the example module, identifiers are compared to determine whether further reads are
available for the insert currently being processed. If the next read does not belong to the current
insert, reordering of the reads by paired-end index is triggered. Reads are then transmitted for
downstream processing by invocation of emit on the first of the collected reads. In concert with