An Integrated Data Analysis Suite and Programming ... - TOBIAS-lib

1 Introduction

Next-Generation Sequencing has evolved into a powerful tool for many areas of biological science, but has also introduced many new challenges related to data analysis and computing infrastructure into the field. Following a brief recapitulation of high-throughput DNA sequencing technology, this chapter highlights important areas of application while discussing previous and related work. We outline general sequencing data analysis methodology and conclude by establishing the motivation for this work.

1.1 High-Throughput DNA Sequencing

Widely deployed for less than a decade, parallel next-generation sequencing technology has already had a considerable impact on genetic research. The novel high-throughput sequencing approaches are characterized by massive parallelism implemented in a single device. The key advantage resulting from this design is a significantly reduced cost per sequenced base, which has allowed it to quickly displace the previously predominant Sanger sequencing method in many areas of application. Sanger sequencing, the prevalent sequencing method for most of the time since its introduction in 1977 [2], had to rely on extensive automation in dedicated sequencing centers to achieve time- and cost-effective readout of large amounts of sequence. Most prominently, this was put into effect during the effort to generate the first draft sequence of the human genome [36]. By contrast, high-throughput sequencing instruments are designed to analyze thousands to millions of DNA molecules simultaneously, and thus enable even smaller institutions to produce large quantities of sequence data on site.

Beyond elucidating DNA primary structure, sequencing used as a random sampling device delivers quantitative clues about sample composition. In this capacity, it has been used for diverse purposes such as inference of DNA methylation levels [7] or the analysis of environmental samples [8].
In concert with the economic advantage over the dideoxynucleotide chain-termination method, the wide accessibility of parallel DNA sequencing to researchers has promoted such deep sequencing approaches, which open up whole new areas of application beyond the domain previously occupied by Sanger sequencing. As an alternative to microarray technologies, for example in gene expression profiling or chromatin immunoprecipitation assays, deep sequencing overcomes fundamental technological restrictions such as probe resolution and probe saturation. Furthermore, because it carries no inherent requirement for a priori knowledge of the sequences to be detected or quantified, the development of novel application protocols has been furthered, transforming high-throughput sequencing methods into a versatile tool for research.
