Applied Biosystems SOLiD™ 4 System SETS Software User Guide ...
Appendix B<br />
Advanced Topic: Data Analysis<br />
Overview<br />
The topics provided in this appendix are intended for advanced users<br />
of the SOLiD 4 System and do not apply to the typical user.<br />
Fundamentals of color-space analysis<br />
The 2-base color coding scheme<br />
The Applied Biosystems SOLiD 4 System sequencing technology<br />
is based on sequential ligation of dye-labeled oligonucleotides. This<br />
technology makes possible massively parallel sequencing of clonally<br />
amplified DNA fragments. Features of this system, such as mate-paired<br />
analysis and 2-base encoding, enable studies of complex<br />
genomes by providing a greater degree of accuracy. This section<br />
describes the principles of 2-base encoding and the benefits of<br />
performing analysis in the di-base alphabet, known as color-space<br />
analysis.<br />
Until recently, most DNA sequencing was performed using the chain<br />
termination method developed by Frederick Sanger. (Refer to the<br />
paper by Sanger F., Coulson A. R., 1975, A rapid method for<br />
determining sequences in DNA by primed synthesis with DNA<br />
polymerase. J Mol Biol. 94(3): 441-448.) This type of sequencing is<br />
often referred to as Sanger sequencing. Sanger sequencing data is<br />
also encoded in color-space by the four fluorescent dyes used in the<br />
sequencing chemistry and displayed as peaks in an<br />
electropherogram. In Sanger sequencing, each color, representing<br />
only a single nucleotide, is automatically translated to A, C, G, or T.<br />
With the SOLiD 4 System, each color represents four potential 2-base<br />
combinations (see Figure 1). The conversion into nucleotide<br />
base space is usually done after the sequence is aligned to a reference<br />
genome transcribed in color-space. As an alternative, translation can<br />
occur following the generation of a consensus sequence.<br />
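
The relationship between colors and 2-base combinations can be illustrated with a short sketch. It assumes the commonly published SOLiD mapping (color 0 for AA, CC, GG, TT; color 1 for AC, CA, GT, TG; color 2 for AG, GA, CT, TC; color 3 for AT, TA, CG, GC), which is presumed to match Figure 1. The function names and the 2-bit XOR representation are illustrative choices and are not part of the SETS software.

```python
# Illustrative sketch of 2-base (di-base) encoding; not SETS software code.
# Assumption: bases are given 2-bit codes so that the color of a di-base
# is the XOR of its two base codes (the commonly published SOLiD mapping).
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def dibases_for_color(color):
    """List the four 2-base combinations represented by one color."""
    return sorted(
        a + b
        for a in BITS_TO_BASE
        for b in BITS_TO_BASE
        if BASE_TO_BITS[a] ^ BASE_TO_BITS[b] == color
    )


def encode(bases):
    """Convert a base-space string to a leading base followed by colors."""
    colors = [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(bases, bases[1:])]
    return bases[0] + "".join(str(c) for c in colors)


def decode(read):
    """Convert a color-space read (leading base + colors) back to bases."""
    bases = [read[0]]
    for digit in read[1:]:
        # Each color, combined with the previous base, fixes the next base.
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ int(digit)])
    return "".join(bases)


if __name__ == "__main__":
    for color in range(4):
        print(color, dibases_for_color(color))
    print(encode("ATGGC"))
    print(decode(encode("ATGGC")))
```

Because each decoded base depends on the base before it, a single miscalled color alters every base that follows it in a naive translation. This is one reason the conversion to base space is usually deferred until after alignment or consensus generation.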