Sequencing

11th Annual Sequencing, Finishing, and Analysis in the Future Meeting

ESTIMATING THE EFFECTS OF REPEATS ON ASSEMBLY CONTIGUITY

Thursday, 2nd June, 10:00, La Fonda Ballroom, Talk (OS-4.01)

Shoudan Liang, Jason Chin
Pacific Biosciences of California

For a perfect assembler at high coverage, the contiguity of an assembly at a finite read length is limited by repetitive sequences. We study the limit imposed by repeat structures in plants, and contrast it with human, as the read length is increased.

We start with contigs assembled from long reads and perform an all-against-all alignment. Non-unique regions of the contigs define repeats. We require each alignment to be longer than a minimum length, S, so repeats shorter than S do not align. Therefore, as the minimum overlap S is increased, we observe a decrease in the number of repeat regions. For example, for the coffee genome, when the minimum allowed overlap increases from 500 to 5,000 bp, the number of distinct repetitive regions is reduced by more than a factor of 10. This is partly because long repeats are less abundant and partly because short repeats occur in clusters that are seen as unique sequences in the alignment. We developed a method to separate these two effects.

We show the tendency of repeats to cluster in several plant genomes. Clustered repeats are especially difficult to assemble from short reads because even when all short reads are identified as coming from the same 100 kb region, they remain repetitive within the repeat cluster.

A related method to estimate the repeats is to count the abundance of pairs of k-mers separated by a fixed distance. The distance between the k-mers is a proxy for the repeat length. This method has the advantage that it can potentially be applied directly to the long-read data before assembly. We compare the direct estimate from the reads with the estimate from the contigs for several plant genomes.
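
The paired k-mer idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the definition of the distance as the offset between the start positions of the two k-mers, and the summary statistic are all assumptions made for illustration. The sketch also ignores reverse complements and sequencing errors, which a real analysis of long reads would have to handle.

```python
from collections import Counter


def paired_kmer_counts(seq, k, gap):
    """Count pairs of k-mers whose start positions are `gap` bases apart.

    `gap` is a hypothetical parameter name; it stands in for the fixed
    distance between the two k-mers mentioned in the abstract.
    """
    counts = Counter()
    last_start = len(seq) - gap - k  # last index where both k-mers fit
    for i in range(last_start + 1):
        left = seq[i:i + k]
        right = seq[i + gap:i + gap + k]
        counts[(left, right)] += 1
    return counts


def repeated_pair_fraction(seq, k, gap):
    """Fraction of k-mer pair occurrences whose pair is seen more than once.

    A pair that recurs at the same spacing hints at a repeat spanning
    roughly gap + k bases (an illustrative interpretation, not a claim
    about the authors' statistic).
    """
    counts = paired_kmer_counts(seq, k, gap)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total
```

Evaluating `repeated_pair_fraction` over a range of `gap` values would give a rough profile of how repeat content falls off with length, under the assumptions stated above.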
A third way of estimating repeat abundance from long reads is to perform an all-against-all alignment using about 1% of the data. Comparing the result with the alignments expected for an idealized genome of the same size that has no repetitive regions reveals excess alignments related to repeats of different lengths. This can be helpful in choosing assembly parameters. The method we developed is available at https://github.com/pb-sliang/TAP.
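
To make the comparison with an idealized repeat-free genome concrete, the following sketch estimates how many pairwise read overlaps of at least a given length would be expected by chance alone. It is a back-of-the-envelope model, not the method implemented in TAP: it assumes reads of equal length placed uniformly at random on a genome with no repeats, ignores edge effects, and all function and parameter names are hypothetical.

```python
def expected_overlaps(num_reads, read_len, genome_size, min_overlap):
    """Expected number of read pairs overlapping by at least `min_overlap`.

    Assumed model: two reads of length `read_len` overlap by at least
    `min_overlap` when their start positions differ by at most
    read_len - min_overlap, which has probability of about
    2 * (read_len - min_overlap) / genome_size for uniform placement.
    """
    if min_overlap >= read_len:
        return 0.0
    pairs = num_reads * (num_reads - 1) / 2
    p_overlap = min(1.0, 2 * (read_len - min_overlap) / genome_size)
    return pairs * p_overlap


def excess_ratio(observed, num_reads, read_len, genome_size, min_overlap):
    """Ratio of observed to expected overlaps; values above 1 suggest repeats."""
    expected = expected_overlaps(num_reads, read_len, genome_size, min_overlap)
    if expected == 0:
        raise ValueError("no overlaps are expected under this model")
    return observed / expected
```

For example, with 1,000 reads of 10,000 bp from a 1 Mb genome and a 5,000 bp minimum overlap, this model expects roughly 4,995 overlapping pairs, so an observed count far above that would point to repeats of at least that length.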
COFFEE BREAK, sponsored by 10x Genomics, 10:20 – 10:40