Sequencing

11th Annual Sequencing, Finishing, and Analysis in the Future Meeting

SYNERCLUST, A TRULY SCALABLE ORTHOLOG CLUSTERING TOOL

Wednesday, 1st June 20:00, La Fonda NM Room (1st floor), Poster (PS-1b.21)

Christophe Georgescu (1), Alison D Griggs (1), Aviv Regev (1), Ilan Wapinski (2), Brian J Haas (1), Ashlee Earl (1)
(1) Broad Institute, (2) enEvolv

Accurate ortholog identification is a vital component of comparative genomic studies. Popular sequence-similarity-based approaches, such as OrthoMCL, struggle to cluster orthologs when paralogs are frequent, and although phylogenetic-based methods handle paralogs, they are not sufficiently fast or scalable to work on large sets of whole genomes. Furthermore, most approaches do not take synteny into account, which means that information useful for distinguishing paralogs goes unused.

Synergy, originally developed to work on eukaryotic species, uses a hybrid approach to resolve ortholog clusters, relying upon sequence similarity, synteny, and phylogeny. Here, we present SynerClust, a tool that takes the fundamentals of Synergy and adds a number of improvements that retain Synergy’s high accuracy while making it amenable to ortholog clustering of hundreds to thousands of whole-genome data sets, representing either eukaryotic or prokaryotic species.

SynerClust bypasses the all-vs-all BLAST requirement inherent to other clustering tools by selecting and comparing cluster representatives at each node in an input species tree. Working from tip to root, SynerClust solves and keeps track of orthology relationships, ultimately providing the most parsimonious solution that takes into account gene gains and losses, which are commonplace in prokaryotes. We have also optimized SynerClust for memory usage and made it amenable to running on many different compute infrastructures.
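The tip-to-root idea described in the abstract can be illustrated with a toy sketch. This is not SynerClust's implementation: the function names, the nested-tuple tree format, the greedy matching rule, the similarity threshold, and the use of Python's `difflib` as a stand-in for a sequence aligner are all assumptions made for illustration, and synteny and phylogeny are ignored entirely. It only shows how comparing one representative per cluster at each tree node avoids comparing every gene against every other gene.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Toy stand-in for an alignment score: a ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def cluster_node(node, genomes, threshold=0.8):
    """Return clusters for a species-tree node, working from tips to root.

    node     -- a genome name (leaf) or a (left, right) tuple (internal node)
    genomes  -- hypothetical input: {genome_name: {gene_id: sequence}}
    Each cluster is {"rep": representative sequence, "members": [gene ids]}.
    """
    if isinstance(node, str):
        # Leaf: every gene starts as its own single-member cluster.
        return [{"rep": seq, "members": [gene_id]}
                for gene_id, seq in genomes[node].items()]

    left, right = node
    left_clusters = cluster_node(left, genomes, threshold)
    unused = cluster_node(right, genomes, threshold)

    merged = []
    for lc in left_clusters:
        # Compare representatives only, never all members against all members.
        best = max(unused, key=lambda rc: similarity(lc["rep"], rc["rep"]),
                   default=None)
        if best is not None and similarity(lc["rep"], best["rep"]) >= threshold:
            unused.remove(best)
            # Assumed rule: keep the longer sequence as the new representative.
            rep = max(lc["rep"], best["rep"], key=len)
            merged.append({"rep": rep,
                           "members": lc["members"] + best["members"]})
        else:
            # No partner on the other branch: carried upward unchanged,
            # which is where a gene gain or loss would show up.
            merged.append(lc)
    merged.extend(unused)
    return merged


# Made-up example: three genomes, with gene g2 absent from genome C.
example_genomes = {
    "A": {"A_g1": "AAAAAAAAAACCCCCCCCCC", "A_g2": "GGGGGGGGGGTTTTTTTTTT"},
    "B": {"B_g1": "AAAAAAAAAACCCCCCCCCA", "B_g2": "GGGGGGGGGGTTTTTTTTTT"},
    "C": {"C_g1": "AAAAAAAAAACCCCCCCCCC"},
}
example_tree = (("A", "B"), "C")
example_clusters = cluster_node(example_tree, example_genomes)
```

In this made-up example the g1 genes from all three genomes end up in one cluster, while the g2 cluster contains only the genes from A and B, reflecting the absence of g2 in C.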
EXPLORATION OF READ DEPTH AND LENGTH FOR STRUCTURAL VARIATION DETECTION

Wednesday, 1st June 20:00, La Fonda NM Room (1st floor), Poster (PS-1b.22)

Adam English, Jesse Farek, Donna Muzny, William Salerno, Eric Bowerwinkle, Richard Gibbs
Baylor College of Medicine

Long-read sequencing (>1 kbp) offers more complete genomic information than short-read sequencing (~100 bp), but its accuracy and relatively high cost per base limit the practicality of long reads as the sole data source in high-throughput whole-genome sequencing projects. An alternate, more cost-effective strategy is to combine data types, which has been effectively implemented by de novo assembly tools including pacbioToCA and PBJelly. Here we illustrate how SV detection varies with different combinations of sequencing technologies, methods, and coverages.

We first create calls for the haploid cell line CHM1-tert using PBHoney (PMID: 24915764) with 40x PacBio coverage, 134x/400 bp Illumina data, and an independently derived set of PacBio SVs, consolidated through Parliament (PMID: 25886820), a consolidation SV discovery tool, to generate ~25,000 variant loci, ~9,000 of which are supported by short- and long-read hybrid assembly. Next, using lower per-data-type coverage, we explore SV detection in the diploid human HS1011 using 20x PacBio coverage (i.e., 10x per haploid genome) and multiple coverages and insert sizes of Illumina paired-end sequencing, as well as other technologies including aCGH and BioNano Irys optical mapping. These combinations show that PacBio data for evaluation expands the hybrid-assembled variants by 42% and PBHoney’s PacBio discovery by an additional 46%.

Finally, we evaluate the added value of long-read data for an Ashkenazim trio with ~30x coverage for each parent and ~60x coverage for the proband. We find a Mendelian consistency rate of 90% for parental homozygous calls and 75% for proband homozygous calls. By exploring coverage titration points, we have quantified the impact on SV detection of specific combinations of short- and long-read data. Together, these experiments suggest that robust SV detection from whole-genome data can be achieved with hybrid read data at notably low coverages.
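The abstract does not say how its Mendelian consistency rates were computed. As a rough illustration of the general idea, the sketch below uses a common textbook definition: a trio genotype is consistent if the child's genotype can be formed from one allele from each parent. The function names and the encoding of genotypes as alternate-allele counts (0, 1, or 2) are assumptions made for this example, not the authors' method.

```python
def transmissible(genotype):
    """Alternate-allele counts a parent can pass on, given its own count."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def is_consistent(father, mother, child):
    """True if the child genotype can arise from one allele per parent."""
    return any(f + m == child
               for f in transmissible(father)
               for m in transmissible(mother))


def consistency_rate(trios):
    """Fraction of (father, mother, child) genotype triples that are consistent."""
    trios = list(trios)
    if not trios:
        raise ValueError("no trio calls supplied")
    return sum(is_consistent(*t) for t in trios) / len(trios)


# Made-up calls: two consistent and two inconsistent triples.
example_calls = [(2, 2, 2), (2, 2, 1), (1, 1, 0), (0, 0, 1)]
example_rate = consistency_rate(example_calls)
```

Restricting the input to sites where both parents are homozygous, or where the proband is homozygous, would give rates analogous in spirit to the two figures quoted above, although the exact stratification used in the study is not stated.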