An Integrated Data Analysis Suite and Programming ... - TOBIAS-lib
CHAPTER 1. INTRODUCTION

1.2 Approaches to Parallel DNA Sequencing

An early parallel sequencing method, massively parallel signature sequencing (MPSS), was published
in 2000 [9] and had by then already been available as a centralized, commercial service.
However, as its short read length of only 17 to 20 nucleotides proved prohibitive in most
potential areas of application, MPSS mainly found use as a gene expression profiling assay.

A technological and commercial breakthrough came with the deployment of integrated
bench top devices by a variety of different providers. The first high-throughput sequencing
instrument brought to the market was the 454 Life Sciences Genome Sequencer FLX in 2005 [10].
Illumina's Genome Analyzer [11] followed soon after, in 2006, and the Life Technologies
SOLiD system [12] in 2008.

Over a short period of time, the technology has matured considerably, and vendors
have started to broaden their portfolios by tailoring their respective solutions to different operational
scenarios, exemplified by the Illumina HiSeq and MiSeq devices or the 454 Life Sciences
bench top model GS Junior. In addition, diverse alternative approaches have been proposed
and introduced into the market, notably a semiconductor-based solution marketed as Ion
Torrent by Life Technologies [13] and the Pacific Biosciences single molecule sequencing system
PacBio RS [14].

As their defining feature, all high-throughput sequencing systems share the parallel interrogation
of large numbers of DNA molecules. Sample DNA to be analyzed is typically applied to
an expendable flow cell or flow chip, a specialized glass slide or microtiter plate particular to
the respective technology. Beyond these fundamental similarities, the respective methods of sequence
interrogation differ in key aspects. Most current technologies are based on the
sequencing-by-synthesis (SBS) principle, with the notable exception of the SOLiD
sequencing-by-ligation method. Sequencing-by-synthesis decodes the sequence of DNA molecules by keeping
track of the nucleotides incorporated by a DNA polymerase during complementary strand synthesis.
The individual technological implementations differ chiefly in the methods
they employ for detecting nucleotide incorporation events.

The 454 family of sequencers implements a pyrosequencing approach. The four deoxyribonucleoside
triphosphates (dNTPs), carrying the bases adenine, cytosine, guanine and thymine, are flowed cyclically
over microtiter wells holding clonally amplified DNA templates, where each flow of reagents
delivers a specific type of dNTP. The amount of pyrophosphate released during nucleotide
incorporation, which is determined by the number of nucleotides incorporated in each flow, is
converted into light intensity through an enzymatic reaction and recorded by a CCD camera. [10]
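
The relationship between flows and signal can be illustrated with a minimal sketch. It computes an idealized, noise-free flowgram, that is, the number of nucleotides incorporated in each flow. The function name, the cyclic flow order "TACG" and the convention of passing the synthesized strand directly (rather than the template) are assumptions made for illustration; they are not taken from the vendor's software.

```python
def simulate_flowgram(synthesized, flow_order="TACG", n_flows=8):
    """Return the ideal number of bases incorporated in each flow.

    `synthesized` is the strand being built by the polymerase (assumption:
    we skip complementing the template to keep the sketch short).
    """
    signals = []
    pos = 0  # index of the next base still to be incorporated
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]  # dNTP delivered in this flow
        run = 0
        # A whole homopolymer run is incorporated within a single flow.
        while pos < len(synthesized) and synthesized[pos] == base:
            run += 1
            pos += 1
        signals.append(run)  # idealized light intensity is proportional to run
    return signals


print(simulate_flowgram("TTACGG", n_flows=4))
```

In this idealized model the signal of a flow equals the homopolymer length; a flow that delivers a non-matching dNTP yields a signal of zero.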

The Illumina systems are based on reversible chain termination chemistry. A mixture of
all four dNTPs is flowed over a slide with randomly distributed clusters of clonally amplified
templates. The dNTPs are modified by a chain terminator, limiting elongation of the synthesized
strand to a single nucleotide per cycle. Furthermore, each type of dNTP is labeled with a
specific fluorophore, allowing for identification of the respective incorporated base by means
of laser excitation and a CCD detector. Reagents applied subsequently remove the reversible
terminator and dye to enable further strand elongation in the following cycle. [11]
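
Because exactly one labeled nucleotide is incorporated per cycle, base identification reduces, in the simplest view, to picking the brightest of four fluorescence channels in each cycle. The sketch below shows this under strong simplifying assumptions: the channel order "ACGT", the function name and the intensity values are hypothetical, and production base callers additionally correct for effects such as channel crosstalk and loss of synchrony within a cluster.

```python
CHANNELS = "ACGT"  # assumed order of the four fluorophore channels


def call_bases(intensities):
    """Call one base per cycle from per-channel intensities.

    `intensities` holds one 4-tuple per sequencing cycle, ordered as CHANNELS.
    """
    read = []
    for cycle in intensities:
        # Index of the channel with the highest intensity in this cycle.
        brightest = max(range(len(CHANNELS)), key=lambda k: cycle[k])
        read.append(CHANNELS[brightest])
    return "".join(read)


# Three cycles of made-up intensities for a single cluster.
cluster = [(0.9, 0.1, 0.0, 0.1), (0.1, 0.2, 0.8, 0.0), (0.0, 0.1, 0.1, 0.7)]
print(call_bases(cluster))
```

Each cycle contributes exactly one base to the read, so the read length equals the number of cycles.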

The Ion Torrent design adopts the homopolymer detection approach introduced by pyrosequencing.
However, instead of pyrophosphate, the release of hydrogen ions serves as a proxy for
strand elongation. The individual pH measurements are realized non-optically on an expendable
semiconductor chip. [13]
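
For flow-based methods of this kind, whether the signal is optical or a pH change, a read can be recovered by converting each flow signal into a homopolymer length. The following sketch assumes normalized signals in which a value of 1.0 corresponds to one incorporated nucleotide, and it simply rounds to the nearest integer; the function name, the flow order "TACG" and the rounding rule are illustrative assumptions rather than any vendor's actual base-calling procedure.

```python
def decode_flowgram(signals, flow_order="TACG"):
    """Translate normalized per-flow signals into a nucleotide sequence."""
    bases = []
    for i, signal in enumerate(signals):
        # Nearest-integer homopolymer length, never negative.
        run_length = max(0, round(signal))
        bases.append(flow_order[i % len(flow_order)] * run_length)
    return "".join(bases)


# Made-up noisy signals for the flows T, A, C, G.
print(decode_flowgram([2.1, 0.9, 0.1, 1.8]))
```

Since the homopolymer length is inferred from signal magnitude, a measurement error of half a unit is enough to change the called run length by one base.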

PacBio RS sequencers implement a technology termed single molecule real-time sequencing
(SMRT). The method relies on zero-mode waveguides (ZMWs), nanostructures that allow
probing of volumes smaller than the wavelength of light [15]. In contrast to the other current
technologies described, the SMRT approach therefore does not need to rely on clonally amplified