11th Annual Sequencing, Finishing, and Analysis in the Future Meeting

TOWARDS BUILDING COMPLETE GENOME
ASSEMBLIES USING BIONANO NEXT-GENERATION
MAPPING TECHNOLOGY

Thursday, 2nd June 17:25 La Fonda Ballroom Tech Talk (TT‐2.08)

Andy Wing Chun Pang¹, Thomas Anantharaman¹, Xiang Zhou¹, Jian Wang¹, Joyce Lee¹,
Evan Eichler², Tina Graves Lindsay³, Alex Hastie¹, Han Cao¹
¹BioNano Genomics, ²University of Washington, ³Washington University

High‐quality assemblies are important for understanding the biology of genomes. Current
short‐read assemblers are memory intensive and have difficulty constructing contiguous assemblies,
while collecting deep‐coverage data with long‐read technologies can be time‐consuming and expensive.
BioNano’s Next‐Generation Mapping (NGM) data complement short‐read data by flagging and
correcting inaccuracies and increasing contiguity, thereby reducing the need to collect high‐coverage
long‐read data.

The BioNano Genomics Irys® System uses high‐molecular‐weight DNA to construct physical genome
maps. These maps can reveal large structural variants, and they can also be combined with
sequence assemblies to produce hybrid scaffolds of unprecedented lengths, with some spanning
chromosomal arms. In addition, aligning the sequence and BioNano assemblies makes it possible to
identify chimeric join errors, which appear as conflicting alignment junctions. Chimeric joins,
in which two distal regions of the genome are incorrectly placed together by assembly algorithms, may
form when short reads, molecules, or paired‐end inserts are unable to span long DNA repeats.

We developed a new hybrid scaffold pipeline that detects and resolves these conflicting junctions
between the sequence and the BioNano assemblies. At a conflicting junction, the pipeline uses
BioNano’s long molecules to determine which assembly has been constructed incorrectly. Specifically,
it checks the chimeric quality scores surrounding the conflicting junction on the genome map for any
evidence of a misassembly. This score indicates the percentage of BioNano molecules aligned 55
kb to the left and to the right of a locus. If the junction on the genome map has low scores (less
than 35%), the genome map support is considered relatively weak, and the pipeline
cuts the genome map at the conflict, thus resolving it. Conversely, if the genome
map has high chimeric quality scores, the sequence contig is cut. Importantly, this
automatic conflict‐resolution function can be manually modified to give users fine control
in generating high‐quality, complete hybrid scaffolds.

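The decision rule described above can be illustrated with a minimal Python sketch. It is not the BioNano pipeline's code: the `Conflict` record, the function names, and the choice to let the lowest score near the junction decide are all assumptions made for illustration. Only the 35% threshold and the two outcomes (cut the genome map or cut the sequence contig) come from the abstract. The optional `overrides` argument loosely mirrors the manual adjustment the abstract mentions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Sequence

# Threshold stated in the abstract: scores below 35% count as weak support.
LOW_SCORE_THRESHOLD = 35.0


@dataclass
class Conflict:
    """Hypothetical record for one conflicting junction (illustrative only)."""
    junction_id: str
    # Chimeric quality scores (percent) at genome-map loci around the junction.
    chimeric_scores: Sequence[float]


def resolve_conflict(conflict: Conflict,
                     threshold: float = LOW_SCORE_THRESHOLD) -> str:
    """Return which assembly to cut: 'genome_map' or 'sequence_contig'."""
    if not conflict.chimeric_scores:
        raise ValueError("no chimeric quality scores for this junction")
    # Assumption: the weakest score near the junction decides the outcome.
    weakest = min(conflict.chimeric_scores)
    if weakest < threshold:
        # Weak molecule support: treat the genome map as misassembled.
        return "genome_map"
    # Strong molecule support: treat the sequence contig as misassembled.
    return "sequence_contig"


def resolve_all(conflicts: Iterable[Conflict],
                overrides: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Resolve every junction, letting manual overrides take precedence."""
    overrides = overrides or {}
    decisions = {}
    for conflict in conflicts:
        if conflict.junction_id in overrides:
            decisions[conflict.junction_id] = overrides[conflict.junction_id]
        else:
            decisions[conflict.junction_id] = resolve_conflict(conflict)
    return decisions
```

For example, a junction with scores `[80.0, 20.0, 75.0]` would lead to a cut in the genome map under these assumptions, whereas one with scores `[90.0, 85.0]` would lead to a cut in the sequence contig.
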
We applied this hybrid scaffold pipeline to a haploid human genome (CHM1). The genome has been
sequenced and genome mapped, and the N50 values of the two assemblies are 27.7 Mb and 3.9 Mb, respectively.
The pipeline resolved 23 chimeric joins in the sequence assembly and three in the BioNano genome
maps. Moreover, combining the two refined assemblies produced ultra‐long hybrid scaffolds with
an N50 of 58.4 Mb and a total length of 2.9 Gb.

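For readers less familiar with the N50 statistic quoted above: it is the length L such that contigs (or scaffolds) of length at least L together cover at least half of the total assembly length. A minimal Python sketch follows. The function name and the toy lengths are illustrative and are not data from the CHM1 assemblies.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of positive contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one length is required")
    half_total = sum(ordered) / 2
    running = 0
    # Walk from the longest piece down until half the assembly is covered.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    # Only reachable if the lengths are not positive.
    raise ValueError("lengths must be positive")


# Toy example: total length 20, so half is 10; 6 + 5 = 11 reaches it.
print(n50([2, 3, 4, 5, 6]))  # prints 5
```
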
This new hybrid scaffold functionality further enhances the construction of highly accurate and
contiguous reference assemblies for complex plant and animal genomes using BioNano mapping
technology.